Studies in Industrial Vision


I. The Validity of Lateral Phoria Measurements in the
Ortho-Rater 1


S. Edgar Wirt 


Purdue University 


The investigations here reported were undertaken to determine the 
validity of lateral phoria measurements at the optical equivalent of eight 
meters distance in the Bausch and Lomb Ortho-Rater,2 a precision stereoscope
providing a battery of vision tests for rapid use in industry. Fry
(1) reported discrepancies between phoria measures in the Keystone
Telebinocular 3 (another type of stereoscope) and at true distance.
Neumueller (3) has questioned the validity of phoria measurements in 
any stereoscope on the basis of the lack of correspondence between 
phorias he measured in a simple hand stereoscope and also by another 
method at a true distance. Results from these different methods, also 
reported here for comparison, indicate some of the factors that might 
invalidate stereoscope phoria tests and indicate the specifications for
phoria tests in a stereoscope that will be valid and suitable for use in 
industry. 


Significance and Interpretation of Phoria Tests 


The usefulness of visual phoria tests for classification and placement 
of industrial workers has been shown by Tiffin (7). Measurement of 
visual phorias of industrial employees and applicants has been recom- 
mended by leaders in the optical professions (see Snell, 6) and adopted 
by some industries as one part of a visual testing routine for determining 


1 This article is based on the author’s thesis, “Certain Factors in the Measurement
of Visual Phorias,” submitted to the Faculty of Purdue University in partial fulfillment 
of the requirements for the degree of Doctor of Philosophy, May, 1942. These studies 
were part of an extensive investigation of visual problems in industry directed by 
Professor Joseph Tiffin in collaboration with the Bausch and Lomb Optical Company, 
Rochester, New York. Subsequent reports will cover other phases of this investigation. 

2 Bausch and Lomb Occupational Vision Tests, the Bausch and Lomb Optical Co.,
Rochester, N. Y.

3 Keystone Visual Safety Tests, the Keystone View Co., Meadville, Pennsylvania.


the acceptability of an applicant and also for discovering employees who 
most likely would benefit from professional ocular attention. Certain 
limits of individual variations in phoria measurements have been incor- 
porated as minimum standards for certain jobs, such as airplane pilots. 
The more general use of such measures proposed by Tiffin for manu- 
facturing industry is as a means of classifying and placing employees 
according to their visual aptitudes for specific jobs even after any mini- 
mum visual requirements for employment have been met. 

Under ordinary conditions of seeing, a normal pair of eyes performs 
a triangulation of the visual axes upon any point of regard—a modifica- 
tion of the posture of the eyeballs (or some substitute mechanism affecting 
the direction of the visual axes; see Park, 4) with respect to each other 
so that the image of the point of regard will be projected in each eye on 
the functional center of the retina. While the eyes are making conjugate 
excursions up, down, right, left, and around, they also vary their relative 
posture—from parallel for distant objects to sharply converging for very 
near objects. 

Under certain test conditions, in which the need for binocular fixation 
and convergence is suspended, a pair of eyes will often assume a posture 
of convergence (or divergence) different from that normally required for 
the particular visual distance of the test. Deviations of convergence, 
under such conditions, from the normal or expected convergence for a 
given distance are called phorias. They are measured in angular units 
(usually tangents) from the normal posture for the test distance, and 
classified as esophoria (greater convergence) and exophoria (less con- 
vergence, or divergence). Ordinarily either of these deviations is not 
measured separately for each eye, since the eyes can alternate in fixation 
and deviation, but as the total deviation in posture of the two visual axes 
from their normal convergence for a given distance. Equal deviations 
in the direction of over- and under-convergence do not have equivalent 
significance for industrial purposes, and therefore deviations must be 
identified as to direction as well as amount. Further, convergence 
posture in a phoria test varies as a function of the focus requirement. 
Phoria measures therefore have significance only in terms of a specific 
focus requirement, or of sharp focus at a specific distance. 

The phenomenon of a phoria is in a sense an artifact, since it occurs, 
by definition, only under certain artificial test conditions. (Deviations 
of the eyes during habitual seeing are not classed as phorias.) The full 
explanation of phorias has not been determined finally, and various expla- 
nations have been proposed. Evidence reported by Tiffin (7) indicates 
that phorias tend to shift consistently with age and with continued 
experience on certain jobs. Possibly they reflect some aspect of visual 
adaptation to certain situations, some aspect of the physiological economy
of visual adjustments. Whatever the cause and explanation of
phorias may be, it is important to measure them among industrial em- 
ployees because they correlate with certain aspects of job performance. 

Subjective methods of phoria measurement are to be preferred to 
objective methods, not only because they are more convenient but also 
because they indicate the angle of convergence of the functional axes of 
vision rather than the anatomical axes of the eyeballs. However, such 
subjective methods require rigid controls that can be stated accurately. 
The stereoscopic method of phoria measurement, early described by 
Wells (8), has seemed the most convenient and “fool-proof” method for
measuring phorias in industries, especially in the hands of lay testers 
who do much of the visual testing in industry for purposes of employee 
classification and placement. To the degree that stereoscope phoria tests 
are reliable and correlate with measures of industrial performance, they 
will be satisfactory for this use in industry. But in order to identify 
cases needing professional attention, stereoscope phoria tests should also 
correlate with corresponding clinical tests at true distances, so that 
anomalies disclosed by the stereoscope in industry can be verified in the 
doctor’s office. 

Insofar as scores on the two tests correlate well with each other 
(i.e., individual scores tend to retain the same relative positions on the 
two scales), scores on one test can be predicted accurately from scores 
on the other test regardless of any difference in scales or difference in 
average convergence on the two tests. 
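The point that a good correlation survives any difference in scale or in average convergence can be illustrated with a short sketch. The scores below are hypothetical values invented for the illustration, not data from this study, and the function name is likewise an assumption.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical phoria scores (prism diopters) for six subjects on two tests.
stereoscope = [3.0, 1.0, 0.0, 2.0, -1.0, 4.0]
true_distance = [2.0, 0.5, -1.0, 1.5, -1.5, 2.5]

# The same stereoscope scores on a different scale and with a shifted mean.
rescaled = [2.0 * s + 5.0 for s in stereoscope]

print(round(pearson_r(stereoscope, true_distance), 4))
print(round(pearson_r(rescaled, true_distance), 4))
```

Both printed values agree, since a change of unit or a constant added to every score leaves the relative positions of the subjects unchanged.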


Phoria Tests in the Ortho-Rater and at a True Distance 


In order to determine the reliability of phoria tests in the Ortho- 
Rater and their correlation with a typical clinical test at a true distance, 
300 college students were tested rapidly on each of seven phoria tests, 
consisting of three different test slides duplicated in two different experi- 
mental models of the Ortho-Rater and a criterion test at a distance of 
six meters. In the Ortho-Rater, adjacent but separate stimulus fields 
are presented to the two eyes (see Figure 1) at the optical equivalent of 
eight meters distance. Each design is made up on a photographic plate, 
mounted in a drum for rapid change with mechanical accuracy in posi- 
tion, transilluminated with a light in the drum. A horizontal scale 
viewed with one eye and a vertical indicator viewed with the other eye 
appear as two elements in the same field, with the indicator pointing to 
some part of the scale. Deviations in convergence of the visual axes 
make the scale and indicator appear in different lateral relation to each 
other, and a reading of the scale directly in line with the indicator gives 
a score that represents a specific convergence posture. Figure 1 shows 
the phoria test designs used in this study, selected from many different 
Fig. 1. Some phoria test designs used in the Ortho-Rater.


designs that had previously been tried. Each has one or more dots over 
the arrow which may be seen superimposed on one or more dots in the 
numbered row. In test slide No. 1 the dot is so small that it did not 
prevent the arrow from “floating” to different positions during the test,
sometimes making the test more difficult to score. In test slide No. 2, 
five dots over the arrow tend to fuse with five consecutive dots in the
numbered row, holding the arrow steady under one number. There was 
adequate opportunity for individual differences in the position of the 
indicator under the scale, but for any one subject this position became 
more quickly stabilized. The subjects’ responses were somewhat more 
prompt and definite than with the free-floating arrow. Slide 3 is a 
duplicate of slide 2 except that there is only one dot over the arrow, 
involving less fusion with dots in the numbered row. These are an 
adaptation of a design published by Wells (8) and credited by him to 
Javal. Slides 2 and 3 were used in negative form—transparent figures 
on an opaque ground. In order that focus on the test slide would be as 
exact as possible, the numerals were reduced to the smallest size that 
could be seen by practically all subjects, and the checkered arrow was 
introduced to equalize the focus stimulus to the two eyes. All slides 
were calibrated in prism diopters (i.e., multiples of 1/100th the testing 
distance). 

The Ortho-Rater slides are at the optical equivalent of a distance of 
approximately 26 feet—not at the focal length (13 inches) of the lenses. 
Optical centers of the lenses are 80 mm. apart. The experimental model 
B Ortho-Rater was partly enclosed and had a short septum. Model C 
was fully enclosed and had apertures that excluded from view everything 
but the test design, thus reducing the cues by which a subject could know 
or be reminded of the actual nearness of the test and eliminating any 
possible points of fixation (such as might be present on a septum) be- 
tween the test slide and the subject’s eyes. 
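The statement that the slides are not at the focal length of the lenses can be checked with the thin-lens relation. The sketch below is only an approximation: it assumes an ideal thin lens and ignores the separation between lens and eye, and the function name is introduced here for illustration.

```python
def slide_distance_m(focal_length_m, apparent_distance_m):
    """Lens-to-slide distance that places the virtual image at apparent_distance_m.

    Thin-lens relation for a virtual image on the same side as the object:
    1 / d_slide = 1 / f + 1 / d_image.
    """
    return 1.0 / (1.0 / focal_length_m + 1.0 / apparent_distance_m)

focal_length = 1.0 / 3.0   # one-third meter (about 13 inches)
apparent = 8.0             # optical equivalent of eight meters

d_slide = slide_distance_m(focal_length, apparent)
print(f"slide distance: {d_slide * 100:.1f} cm")
print(f"inside the focal length by: {(focal_length - d_slide) * 1000:.1f} mm")

# A slide placed exactly at the focal length corresponds to infinite distance.
print(slide_distance_m(0.2, float("inf")))
```

Under these assumptions the slide lies a little more than a centimeter inside the focal length, which is what moves the apparent distance in from infinity to eight meters.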

For a criterion test a modification of the Stevenson test, consisting 
of a horizontal scale under a vertical arrow in black on a plain, light 
background, was set up at a distance of six meters. The subject viewed 
this chart through prisms, base down before one eye and base up before 
the other. The test chart appeared double, higher to one eye and lower 
to the other, so that the arrow seen by one eye indicated various positions 
on the scale seen by the other eye as the two eyes varied their convergence 
posture relative to each other. According to Sheard (5) this method of 
measurement is an adaptation of an idea of Maddox. 

The modification introduced by Wirt and Reed 4 consisted of a system
of apertures, close enough to the eyes to be quite out of focus, and
completely hooded, so as not to be included in the apparent structure of 
the test field, arranged so that each eye had a view of nothing but the 
useful area of the test chart. These apertures thus eliminated from view 
all borders of the test chart and other extraneous forms that might 
induce fusion and any objects at intervening positions that might induce 


4 The author is indebted to Mr. Ray Reed of the Bausch and Lomb Optical Co.
for his suggestions and help in constructing this test. 
convergence. Identical focus stimuli are presented to both eyes; this is
the nearest approximation to the characteristics of the phoria slides used 
in the stereoscope, and it standardized the focus requirement in the test. 
The scale covered a range of 25 prism diopter intervals, numbered con- 
secutively from one end to the other with numerals approximately 
equivalent in legibility to 20/60 Snellen letters—or letters that could be 
read by a “standard eye” at a distance of 60 feet (but more easily recog- 
nizable because they were in normal sequence). The lateral limits of the 
field of view were well beyond the range of numerals. 

Three hundred volunteer subjects solicited at random from a univer- 
sity student population were tested rapidly on all seven tests in three 
hours. Several other volunteers who did not have simultaneous binocu- 
lar vision were quickly discovered and eliminated from the experiment. 
After each 50 subjects the sequence of instruments was changed so that 
each of the six possible sequences was used with 50 subjects. In each 


Table 1


Differences and Relationships between Phoria Tests in the Ortho-Rater and a
Criterion Test at a Distance of Six Meters. N = 300.


                          Ortho-Rater Model B         Ortho-Rater Model C      Stevenson
                       Slide 1  Slide 2  Slide 3   Slide 1  Slide 2  Slide 3     Test

Mean convergence
  (prism diopters)       1.19     1.53     1.5?      1.25     1.44     1.59       .60
S.D. of convergence
  (prism diopters)       1.97     2.12     2.06      2.30     2.29     2.39      2.07
Correlation with
  criterion test          .78      .78      .78       .84      .76      .80
Probable error of
  prediction (prism
  diopters)               .87      .88      .87       .77      .92      .83





group of 50 subjects, 25 saw both sets of Ortho-Rater slides in 1-2-3 
sequence and 25 saw them in 3-2-1 sequence. Each score recorded was
the subject’s response to the question, “To what number does the arrow
point?” Half-scores were recorded occasionally on Slide No. 1 but infrequently
on Slides 2 and 3. For calculation all fractions were dropped.
Subjects did not handle or grasp any part of the testing instruments. 
Table 1 summarizes the results of these tests. Means and standard 
deviations are given in prism diopters of absolute convergence from 
parallel. A significant difference in mean convergence on two different 
tests would indicate a difference in the effective stimulus to convergence 
in the different test situations. Such a difference, if it were consistent 
for all subjects, would not necessarily reduce the correlation between the 
tests nor the accuracy with which scores on one test could be predicted
from scores on the other test. A greater standard deviation in one of 
two tests with identical scale units, indicating a difference in the average 
deviation of convergence from the group norm, might reflect the intrusion 
of an extraneous stimulus to convergence. In the absence of any sig- 
nificant difference in mean convergence, a greater standard deviation 
would probably signify less restriction due to peripheral fusional stimuli. 
A reduction in “spread” of scores on any phoria test would tend to limit 
the possible correlation of this test with any other test or with any 
measure of industrial efficiency. 

The mean of the Stevenson test was 0.6 prism diopters of convergence. 
(Since average normal convergence for a distance of six meters is approxi- 
mately one prism diopter, this mean represents an exophoria of 0.4 prism 
diopters.) Means of the Ortho-Rater tests ranged from 1.19 to 1.59
prism diopters. (Since 0.75 to 0.80 prism diopters would represent the 
convergence required at the corresponding true distance of eight meters, 
these figures represent 0.4 to 0.8 prism diopters of esophoria.) Slide 1 
in each instrument (bright background with almost no fusion element) 
showed from 0.2 to 0.4 prism diopters less convergence than the other 
slides in the same instrument. (A difference of 0.16 between means on 
different slides is significant at the one percent level.) No significant 
difference occurred between identical slides in the two instruments. 
Mean differences of a fraction of a prism diopter would not likely be 
considered important from a clinical standpoint, since phorias ordinarily 
are not measured more precisely than in whole prism diopter units. 
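The conversion from absolute convergence to a phoria, as used in the parenthetical remarks above, can be sketched as follows. The normal convergence for a distance is taken here as the interpupillary distance in centimeters divided by the test distance in meters; the 6.0 cm interpupillary distance is an assumption chosen because it reproduces the figures quoted in the text (about 1.0 prism diopter at six meters and 0.75 at eight meters). The function names are illustrative only.

```python
def normal_convergence_pd(distance_m, ipd_cm=6.0):
    """Convergence (prism diopters) needed to fixate at distance_m.

    One prism diopter is 1 cm of displacement per meter of distance,
    so the total convergence is the interpupillary distance in cm
    divided by the distance in meters. ipd_cm = 6.0 is an assumed value.
    """
    return ipd_cm / distance_m

def classify_phoria(measured_pd, distance_m, ipd_cm=6.0):
    """Return (kind, amount) for a measured absolute convergence."""
    deviation = measured_pd - normal_convergence_pd(distance_m, ipd_cm)
    if deviation > 0:
        kind = "esophoria"      # more convergence than the distance requires
    elif deviation < 0:
        kind = "exophoria"      # less convergence than the distance requires
    else:
        kind = "orthophoria"    # no deviation
    return kind, abs(deviation)

# Mean of the Stevenson test at six meters.
print(classify_phoria(0.60, 6.0))
# Lowest and highest Ortho-Rater means at the equivalent of eight meters.
print(classify_phoria(1.19, 8.0))
print(classify_phoria(1.59, 8.0))
```

With a slightly larger assumed interpupillary distance (6.4 cm) the normal convergence at eight meters becomes 0.80, which accounts for the range of 0.75 to 0.80 given above.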

The standard deviation of scores on the criterion test was 2.07 prism
diopters, and on the Ortho-Rater ranged from 1.97 to 2.39 prism diopters. 
(A difference of 0.12 prism diopters is statistically reliable at the one 
per cent level.) Dispersion was consistently, though not reliably, greater 
in Model C, which more completely excluded extraneous stimuli, than 
in Model B, and reliably greater for each slide in Model C than on the 
criterion test with identical units of measurement. Phoria tests in the 
Model C Ortho-Rater would therefore permit somewhat finer classifica- 
tion of phorias and higher correlations with other measures than would 
be possible with the Stevenson test. It is notable that Slide 2 in each 
instrument did not show significantly less dispersion than the other slides, 
although it provided a greater amount of fusional element within the test 
design than either of the others. Its advantages in administration are 
not offset by any sacrifice in dispersion of scores. 

Slide 1 in Model B correlated .87 with the identical slide in Model C, 
Slide 2 in the two models correlated .84, and Slide 3 in the two models 
correlated .87. These intercorrelations between tests in slightly different 
instruments represent conservatively the reliability of the tests. No 
reliability data were obtained or were available on the Stevenson test.
Tests in the Ortho-Rater correlated with the Stevenson test (Table 1) 
from .76 to .84, without correction for attenuation due to the lack of 
complete reliability for either test. Scores on the criterion test could be 
predicted from scores on the Ortho-Rater tests with probable errors 
ranging from .77 to .92 prism diopters. (A typical formula for such 
prediction, for test C-3, is K = .70X − .99 ± .83.) That is, phoria
scores on the test at a true distance could be predicted accurately from 
Fig. 2. Distribution of phoria scores of 300 students in the Ortho-Rater (Slide C-3).


the Ortho-Rater tests within 0.8 to 0.9 prism diopters in 50% of the 
cases, within 1.0 prism diopter in 55% to 60% of the cases, and within 
2.0 prism diopters in 87% to 91% of the cases. These errors of prediction 
are scarcely greater than the expected error in predicting scores from 
one Ortho-Rater to the other or from first test to second test in the same 
instrument and are probably not great enough to cause any concern from 
a clinical standpoint. For purposes of classification in industry, all these 
tests are practically interchangeable, and the more convenient Ortho- 
Rater tests can well be substituted for a test at a true distance of twenty 
feet or six meters. 
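The probable errors of prediction in Table 1 can be approximately reproduced from the tabled correlations and the standard deviation of the criterion test. The sketch below assumes the conventional formula, probable error = 0.6745 × S.D. × √(1 − r²), and a normal distribution of errors; the text does not state which formula was used, so this is an assumption, and the function names are illustrative.

```python
import math

def probable_error(sd_criterion, r):
    """Probable error of predicting the criterion from a test correlating r with it."""
    return 0.6745 * sd_criterion * math.sqrt(1.0 - r * r)

def regression_slope(r, sd_criterion, sd_predictor):
    """Slope of the regression of criterion scores on predictor scores."""
    return r * sd_criterion / sd_predictor

def share_within(limit, pe):
    """Share of cases predicted within +/- limit, assuming normal errors."""
    sigma = pe / 0.6745
    return math.erf(limit / (sigma * math.sqrt(2.0)))

SD_STEVENSON = 2.07          # Table 1, criterion test
r_c3, sd_c3 = 0.80, 2.39     # Table 1, Slide 3 in Model C

pe = probable_error(SD_STEVENSON, r_c3)
print(f"probable error: {pe:.2f}")
print(f"slope: {regression_slope(r_c3, SD_STEVENSON, sd_c3):.2f}")
for limit in (1.0, 2.0):
    print(f"within {limit} prism diopters: {share_within(limit, pe):.0%}")
```

For the six Ortho-Rater tests the computed values fall within about one hundredth of the tabled probable errors, and the slope for test C-3 comes out close to the coefficient of .70 in the formula quoted above.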

Figure 2 shows the distribution of raw phoria scores for 300 students 
on Slide 3, Model C Ortho-Rater. The curve is essentially a normal
curve, with some curtailment of the distribution evident in both extreme 
categories. Probably this test tends to minimize extreme deviations
from normal by consolidating them into single categories. But the 
tangent scale itself may tend to exaggerate extreme deviations, since 
prism diopter units farther from the perpendicular represent smaller 
angular units of deviation. For purposes of classification in industry 
there is little justification for dividing up these extreme groups more 
finely, and cases at these extremities would certainly be verified as defi- 
nitely abnormal on a clinical phoria test. This tangent scale in prism 
diopter units seems quite suitable for purposes of classification in in- 
dustry. The fact that it provides equal-appearing intervals makes it 
subjectively satisfying and expedites testing. 
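The remark that prism diopter units farther from the perpendicular represent smaller angular units can be made concrete. A deviation of P prism diopters corresponds to an angle whose tangent is P/100; the sketch below computes the angular width of a single scale step at the center of the scale and twelve units out (about half of a 25-unit scale centered on the perpendicular, which is an assumed layout for the illustration).

```python
import math

def angle_deg(prism_diopters):
    """Angle (degrees) whose tangent is prism_diopters / 100."""
    return math.degrees(math.atan(prism_diopters / 100.0))

def step_width_deg(prism_diopters):
    """Angular width of the one-unit step starting at prism_diopters."""
    return angle_deg(prism_diopters + 1) - angle_deg(prism_diopters)

central = step_width_deg(0)
outer = step_width_deg(12)
print(f"step at 0:  {central:.4f} degrees")
print(f"step at 12: {outer:.4f} degrees")
print(f"ratio: {outer / central:.4f}")
```

Over a range of this size the outer steps are narrower than the central step by only one or two percent, so the distortion introduced by the tangent scale is small in comparison with the whole-unit precision of the test.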

As mentioned previously, “double fatigue” order of testing was used
with respect both to slide sequence and instrument sequence. Half the 
subjects saw the slides in both instruments in 1-2-3 order and half in 3-2-1 
order. After every 50 subjects the sequence of instruments was changed 
so that for each possible combination of instrument sequence and each 
alternative slide sequence there were 25 subjects. A summary of phoria 
scores for each of these instrumental arrangements showed no significant 
or systematic difference between first slide, second slide, and last slide 
in either instrument or between first instrument and last instrument on 
corresponding slides. Table 2 shows mean convergence and standard 


Table 2


Means and Standard Deviations of Convergence on Successive Phoria Tests in the
Ortho-Rater. (Figures represent prism diopters.) N = 300.


                                            Mean          S.D.

First instrument       First slide       [illegible]      2.23
                       Last slide           1.52          2.21

Second instrument      First slide          1.28          2.15
                       Last slide           1.39          2.25





deviation of convergence for first and last slides in first and last instru- 
ments. (A difference of 0.3 between means or of 0.2 between standard 
deviations would be significant at the one per cent level. No such 
differences occurred.) Evidently increased experience and familiarity 
with the instrument did not systematically or significantly alter the 
means or standard deviations of the distributions of phoria scores. This 
was not true, however, of a still earlier experimental model of the Ortho- 
Rater, which will be discussed later. 
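The counterbalanced order of testing described above can be laid out as a schedule. The sketch below is a reconstruction under stated assumptions: the three instruments are taken to be the two Ortho-Rater models and the Stevenson test, and the first 25 subjects of each block of 50 are given the 1-2-3 slide order. The text does not say how the two slide orders were distributed within a block, so that detail is hypothetical.

```python
from itertools import permutations

INSTRUMENTS = ("Ortho-Rater B", "Ortho-Rater C", "Stevenson test")
SLIDE_ORDERS = ((1, 2, 3), (3, 2, 1))

def build_schedule(block_size=50):
    """Return one (instrument_order, slide_order) pair per subject."""
    schedule = []
    for instrument_order in permutations(INSTRUMENTS):   # six possible sequences
        for i in range(block_size):
            # Assumed split: first half of the block 1-2-3, second half 3-2-1.
            slide_order = SLIDE_ORDERS[0] if i < block_size // 2 else SLIDE_ORDERS[1]
            schedule.append((instrument_order, slide_order))
    return schedule

schedule = build_schedule()
print(len(schedule), "subjects")
for cell in sorted(set(schedule)):
    print(schedule.count(cell), cell)
```

Each of the twelve combinations of instrument sequence and slide sequence receives 25 subjects, giving the 300 subjects reported.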

In a subsequent industrial survey 5 with Model C, a difference in the
sequence of the phoria test with respect to other tests in a battery resulted
in marked differences in mean scores and standard deviations.


5 This information was not included in the original thesis.
When the phoria test followed acuity and depth perception tests, which 
required normal and precise triangulation, the mean raw score was 8.4 
with an S.D. of 1.4 (562 cases). When the phoria test preceded the other
tests, the mean raw score was 6.7 and the S.D. 2.65 (409 cases); since 
lower raw scores in these tests signify greater convergence, in this second 
instance there was an increase of 1.7 prism diopters in mean convergence 
and an increase of 96% in dispersion of scores. Both these differences 
are reliable and cannot be attributed entirely to chance differences be- 
tween the two samples of the same industrial population. Since the 
second sequence (phoria preceding other tests) corresponds more nearly 
with the conditions of the experiment described above, and since the 
dispersion of scores in this sequence corresponds to that obtained in the 
experiment, where the Ortho-Rater tests correlated satisfactorily with a 
criterion test at a true distance, it would seem most important for valid 
results in industrial situations that the distance lateral phoria test in the 
Ortho-Rater should not be preceded by other tests requiring some specific 
amount of convergence. The greater dispersion by the second arrange- 
ment will permit finer classification of workers and higher correlations 
with criterion tests and with measures of job success in spite of the 
increased mean convergence. 
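The reliability of the difference between the two sequences can be checked roughly from the figures given. The sketch below assumes a large-sample critical ratio for two independent samples (difference between means divided by its standard error); the text does not say which test was actually applied, and the function name is illustrative.

```python
import math

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means in units of its standard error."""
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff

# Phoria test after the other tests: mean 8.4, S.D. 1.4, 562 cases.
# Phoria test before the other tests: mean 6.7, S.D. 2.65, 409 cases.
cr = critical_ratio(8.4, 1.4, 562, 6.7, 2.65, 409)
print(f"critical ratio: {cr:.1f}")

# A ratio of about 2.58 corresponds to the one per cent level (two-sided).
print("reliable at the one per cent level:", abs(cr) > 2.576)
```

On these assumptions the ratio is several times the value required at the one per cent level, in keeping with the statement that the difference in means is reliable.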


Differences in Convergence with Different Instruments 


The close correspondence between phoria tests in the Model C Ortho- 
Rater and at a true distance is in contrast to findings from other experi- 
ments with other instruments. A comparison of mean convergence and 
dispersion of scores for groups of subjects with different instrumentation 
of phoria tests reveals some of the factors that must be controlled if 
stereoscopic tests are to be valid. Table 3 summarizes such data for five 
different methods. 

Method I is the Stevenson test at six meters, as modified by Wirt 
and Reed. Mean absolute convergence for 300 college students was 
0.60 prism diopters (normal convergence for six meters is 1.0 prism 
diopters), the standard deviation 2.07 prism diopters. In comparison 
with these figures, results from the four types of stereoscope show a 
progressive increase in convergence and some differences in dispersion. 
The major objection to phoria tests at an artificial optical distance has 
been that knowledge of the actual nearness of the test induces greater 
convergence, as for a distance closer than the one simulated in the 
stereoscope. Evidently this additional stimulus factor is present in vary- 
ing degrees in these different instruments. 

Method II is with the Ortho-Rater, Model C, described before. By 
way of recapitulation, this instrument has lenses of one-third meter 
(13 inches) focal length with optical centers separated 80 mm. Test
slides are photographic plates mounted in a drum for rapid change at 
the optical equivalent of eight meters, transilluminated from a light in 
the drum. The instrument is entirely enclosed, and apertures cut out 
all view of anything except the useful area of the test slide. The aper- 
tures are close enough to the eyes to be out of focus and are shaped so 
that they cannot be fused even if they could be seen as part of the test 


Table 3


Differences in Phoria Tests by Different Methods. (Figures represent prism
diopters of convergence from parallel visual axes.)


Method                    Group                        N        M      S.D.

I.   Stevenson Test       College students            300     0.60     2.07
II.  Ortho-Rater C
       Slide 1            Same group                  300     1.25     2.30
       Slide 3                                                1.59     2.39
III. Ortho-Rater A
       First Test         College students            103     3.43     2.30
       Second Test        Same group                          3.00     2.21
       First slide        Cranemen and truckers        50     1.39     1.92
       Second slide       Same group                          1.78     1.25
IV.  Telebinocular        Same group                   50     6.90      .99
                          College students            483     7.16     2.80
                          Hosiery workers             241     7.35     2.45
                          Steel mill employees       7046     7.98     2.58
V.   Hand stereoscope
       (Neumueller)       Clinical patients            22     7.66     6.52





field. Results from two test slides (1 and 3) are reported in Table 3.
Mean convergence for the same 300 subjects was 1.25 and 1.59 prism 
diopters (normal convergence for eight meters is 0.75 prism diopters), 
with standard deviations of 2.30 and 2.39 prism diopters. The slight 
increase in convergence and dispersion in comparison with Method I 
indicates that an additional stimulus factor, such as a “nearness concept,”
may be present to a slight degree. However, the close correspondence 
and high correlation of scores between these two methods, as shown in 
Table 1, makes it feasible and practicable to substitute one for the other 
for purposes of classification in industry. 

Method III is an earlier model of the Ortho-Rater which was not 
enclosed and had a full length septum instead of apertures. The end of 
this septum seemed to provide a stimulus to convergence. Mean con- 
vergence for 103 college students was 3.43 and 3.00 prism diopters on 
two occasions with the same test slide, significantly greater than for the
similar group in Method II. This difference represents part of the im- 
provement in instrumentation of the later over the earlier experimental 
model Ortho-Rater. Test and retest, with an average interval of two 
days, in Model A on these students correlated .74 and .85 for two different 
test slides. The second test, and the second slide on each test, showed 
consistently (though not reliably with this number of subjects) less con- 
vergence. Such difference, probably attributable to increased ability to 
disregard the actual nearness of the tests, could not be discovered with 
Models B and C. Possibly this reflects the difference in stimulus on 
initial tests between earlier and later models, the later model having 
largely eliminated the extraneous stimulus to convergence. 

For 50 cranemen and truckers in an aluminum extrusion plant, 
Method III with two different slides gave mean convergence of 1.39 and 
1.78 prism diopters and dispersion of 1.92 and 1.25 prism diopters—all 
significantly less than for college students tested by the same method. 
This difference might be interpreted as indicating greater susceptibility 
to the convergence stimulus of the “nearness concept” on the part of
college students. The fact that they showed less convergence on repeated 
tests indicates that this extraneous factor accounted for some of their 
initial convergence. Also the fact that a similar group of students on 
Methods I and II did not show so much convergence indicates that the 
higher convergence on first test with Method III was somewhat spurious 
and did not quite represent what they would have shown at a true 
distance. 

Method IV was the Keystone 3 Telebinocular, with various standard
and specially constructed slides that did not differ essentially among 
themselves in comparative tests with any single group. The Telebinocu- 
lar has lenses of one-fifth meter (8 inches) focal length, and those used 
in the present study varied in optical separation of lenses from 97 to 
98 mm. (Calculations are based on a separation of 97 mm., which would
tend to minimize very slightly the convergence as here reported.) Test 
slides are photographic paper prints, set at the optical equivalent of 
infinite distance (which would further minimize convergence in compari- 
son with the previous methods, since normal convergence for infinity is 
zero), and interchanged manually. The Telebinocular has gross aper- 
tures and shielding for the apertures, but no enclosure of the test slides.
Change of slides can be observed by the subject at the actual distance of 
eight inches. The standard slide (DB-9) is calibrated and numbered in 
2% prism diopter units with half-units also marked. Distribution curves 
of scores on this test slide show that the numbered intervals receive more 


scores in proportion than the mid-points, with the result that curves are
saw-toothed in appearance rather than smooth. 

On the same 50 truckers and cranemen reported under Method III, 
this Method IV showed mean convergence of 6.90 prism diopters and a 
dispersion of 0.99 prism diopters. The difference in optical distance 
would call for 1.0 prism diopter less convergence by Method IV than by 
Method I and 0.75 prism diopters less than by Methods II and III, but 
the convergence shown by this group was markedly and significantly 
higher. A group of 483 college students showed still greater convergence 
(7.16 P.D.) and much greater dispersion (S.D. = 2.80 P.D.). A group 
of 241 employees doing very close work in a hosiery mill and a group of 
7046 employees on various production and maintenance jobs in a steel 
mill showed also high convergence on the Telebinocular test. Inter- 
correlations between Keystone phoria tests and tests at a comparable 
true distance were not included in the present study. Fry (1) reported 
raw data that showed a correlation of only .40 between phoria tests in 
the Telebinocular and also at a true distance of six meters on 86 elemen- 
tary school children. Fry reported (for 85 cases) mean convergence of 
8.2 prism diopters on the Telebinocular and 2.0 (1.0 prism diopters 
esophoria) on the six meter test. 

Evidently a markedly higher convergence and slightly greater dis- 
persion are characteristic of most tests in the Telebinocular, and this 
must be due to the extraneous stimulus factor related to knowledge of 
the actual nearness of the test slide. Such a factor may account in part 
for the low correlation between this phoria test and one at a true dis- 
tance, and may also be expected to reduce correlations between this test 
and job performance at a corresponding true distance. 

Method V presents computations based on results reported by Neu- 
mueller (3) for 22 clinical patients with an ordinary hand viewing stereo- 
scope. Neumueller’s test slide was at the focal length (eight inches) of 
the stereoscope, which had precision lenses replacing the commercial 
lenses in the instrument, a short septum, but no apertures, shielding, or 
other enclosure of the test slide. Presumably it was held by the subject, 
but the tester alone manipulated the slide. Computations from Neu- 
mueller’s data showed a mean convergence of 7.66 prism diopters with 
a standard deviation of 6.52—both greater than for most other groups 
investigated in this study by other methods. On a criterion test at 20 
feet (presumably like that described by Lesser, 2) Neumueller’s group 
showed a mean convergence of -0.07 prism diopters,† and a standard
deviation of 1.66 (corrected for the small number of cases). This lack
of convergence and small dispersion on the criterion test indicates that
the high convergence and dispersion in the hand stereoscope are not
attributable to a characteristic of this group in contrast with the other
groups reported. The extreme difference is probably due to the factors
in stereoscope construction mentioned above, and also possibly to a
kinesthetic cue of near distance if the subject held the stereoscope in his
own hand. The correlation between Neumueller’s stereoscope tests and
true distance tests was only .07. In this situation scores on one test
could not be predicted from scores on the other test regardless of any
agreement or disagreement between mean scores on the two tests.

† This figure involves a correction of 1.0 prism diopter due to the fact that
Neumueller’s measurements at 20 feet were in terms of deviation from the normal for that
distance, which is approximately 1.0 prism diopter convergence, while his measurements
in the stereoscope were based on deviations from parallel visual axes. This correction
reduces by 1.0 prism diopter the mean difference between methods as indicated in
Neumueller’s raw scores. The same correction was made on Fry’s data.


Summary


It would seem that phoria measures in a stereoscope have more or less
in common with measures at a true distance according to certain
characteristics of the instruments. In the Bausch and Lomb Ortho-Rater, a
13-inch stereoscope, with careful exclusion of extraneous stimuli and with
a slide design demanding focus at the optical equivalent of a finite
distance, phoria measures corresponded quite closely with measures at 20
feet (6 meters), distributions of scores by the two methods were similar,
correlations between scores by the two methods were reasonably high,
and the reliability of the stereoscope tests was reasonably high. Changes
due to increased instrument experience had been eliminated in this model.
At the other extreme is the hand stereoscope in which phoria measures
seemed very erratic and correlated practically zero with those at a true
distance.

These characteristics of the less satisfactory stereoscopes are probably
largely responsible for the difference:


1. Shorter focal length lenses and wider separation of lenses.

2. Fusional elements in the field of view which restrict deviations.

3. Elements stimulating convergence (such as parts of a septum).

4. Obvious nearness of the test slide, without enclosure to facilitate
the illusion of a greater distance.

5. Kinesthetic cues of nearness from grasping the instrument, or cues
from seeing the tester’s manipulation of the test slide.

6. Insufficient stimulus to effect a definite focus posture appropriate
to the optical distance of the test.

7. Test designs permitting too great fluctuations of the position of
the indicator during a test.

Differences in results by different methods do not per se imply greater
validity for any one method with respect to industrial criteria of job 
performance. Whichever method is most successful in predicting em- 
ployee success on the job is the one that should be established as a criterion 
for other phoria tests if they are to be used for such purposes of prediction. 
In evaluation of the validity of a phoria test, correspondence of mean 
convergence in two tests may be desirable as indicating similarity in the 
stimulus fields of the two tests. But the chief criterion is the degree of 
correlation between the two tests and the accuracy of scores predicted 
for one method based on scores obtained by the other method. 

Several pertinent implications can be stated at this point: 


1. It is reasonable to assume that the phoria tests which will correlate 
best with performance at actual working distance are those made by 
methods that involve true distances or that correlate highly with such 
methods. Stereoscope phoria tests at the optical equivalent of infinite 
distance have no correlate in actual conditions of seeing and no doubt 
can be replaced to advantage with tests at the optical equivalent of a 
finite distance. 

2. If the characteristic difference of phorias induced by the “nearness
concept” is of itself related to performance on certain industrial jobs, that
factor should be measured separately.

3. Phoria measures for industrial use that correlate well with stand- 
ard clinical phoria tests will facilitate adequate clinical rehabilitation of 
industrial employees who are referred for professional eye care. How- 
ever, standard clinical phoria tests might also be critically investigated 
with respect to their differences, standardization, reliability, intercorre- 
lations, and relations with industrial criteria. 

4. It is meaningless to try to evaluate phoria measurements in “the
stereoscope” or “a stereoscope” without restriction to specific instru- 
ments. In fact, probably no two methods of phoria testing will yield 
identical results; nor will one method yield identical scores for the same 
subjects on two separate occasions. Any phoria test findings must there- 
fore be stated as obtained by a specific method and with reference to 
norms obtained by that method. 

5. For phoria tests showing a standard deviation of 2.0 to 2.5 prism 
diopters, a tangent scale of 15 intervals of 1.0 prism diopter each would 
seem to be adequate and suitable for purposes of classification in industry. 


Received April 16, 1943.


Measures of Potentiality for Machine Calculation *


Robert M. Gottsdanker 
The George Washington University 


Little has been done as yet toward developing measures of potential 
ability in office-machine work. Because of the ever-increasing impor- 
tance of work of this kind, such measures are necessary. The present 
study is concerned with predicting proficiency in the operation of one of 
the most widely used types of office machine: the crank-driven (or key- 
setting) calculator. 

Attempts have been made in the past to construct tests of aptitude 
for office work. One example is the Minnesota Vocational Test for 
Clerical Workers (3); another is L. L. Thurstone’s Examination in Clerical
Work (5). These tests, as their names indicate, purport to measure 
aptitude in the general field of clerical work; whether they measure apti- 
tude for operating any specific office machine can be known only after 
experimental study. 

The only previous studies published in which tests were validated 
against a criterion of ability to operate the crank-driven calculator were 
done by the United States Employment Service (4). In the two studies 
reported no mention is made of any attempt to construct special tests 
for this kind of work on the basis of an analysis of necessary worker 
characteristics. Instead, it is evident that the tests used were developed 
for more general use. Neither study provided a test battery adequate 
for prediction. In one study a multiple correlation coefficient of .69 was 
obtained between performance scores and scores made on a group of 
aptitude tests. However, this battery consisting of eight tests was 
acknowledged to be too long for practical use. Furthermore, since the 
subjects were all experienced workers at the time the tests were admin- 
istered, prediction of future proficiency scores of persons yet without 
training on the basis of results obtained on this sample is subject to the 
possibility of serious error in that the correlations may have been influ- 
enced by differential experience. The number of cases was too small in 
the other study to yield regression coefficients sufficiently reliable for 
accurate prediction. 

*This paper summarizes a dissertation submitted for the degree of Doctor of 
Philosophy at the University of California. The writer is deeply grateful to Professor 
Edwin E. Ghiselli for his constant aid and encouragement while directing this research. 


A brief report of the investigations was presented at the meetings of the American 
Psychological Association, September 4, 1941.

The present investigation was undertaken in an effort to provide
information regarding aptitude for crank-driven machine calculation 
which had not been forthcoming from any previous studies. More 
specifically, it was the purpose of this study to seek measures of the 
abilities needed for proficient machine computation and to construct a 
test which would predict this proficiency. The method selected for 
accomplishing this end was as follows: first, to analyze the job and to 
select tests which appeared to measure the psychological characteristics 
needed for satisfactory work; second, to validate the tests selected by 
correlating scores made by subjects prior to training in machine compu- 
tation with criterion measures based on work done in a training course; 
and finally, to weight these tests so as to obtain the best possible predic- 
tion of the criterion measures. 


Job Analysis 


There are two main types of calculating machine: the key-driven in 
which numbers are registered as soon as the keys are depressed and the 
crank-driven or key-setting in which setting the numbers and registering 
them are separate operations. Key-driven calculators are of value mainly 
for addition while the crank-driven calculators, although slower at addi- 
tion, can be used to better advantage in the other arithmetic operations. 
In this study we are concerned only with the crank-driven type, of which 
the best known American representatives are the Marchant, the Monroe, 
and the Friden. Therefore, the definition of the duties of a calculating- 
machine operator given in the Dictionary of Occupational Titles (1) may 
be modified to the following form: 


First manipulates keys or levers to set machine in position for the kind of 
operation to be performed. Next, presses proper keys on number keyboard 
to set number into the machine. Finally, manipulates certain keys or levers 
to perform the desired computation. 


Certain of the requisite traits for successful performance of the job 
of calculating-machine operator can be inferred from the foregoing defini- 
tion. However, as Viteles (7) has pointed out, knowledge of the nature 
of the work involved in a job is only one source of information relative 
to the characteristics required for its successful performance. Conse- 
quently, the following three sources were also employed: school manuals 
which state specifically the methods used for the various computations, 
the observation of others operating machines, and the writer’s own experi- 
ence in operating a calculating-machine over a period of several years. 

Experience has shown that in attempting to note the traits which are 
necessary for a job it is easy to overlook ones which may be important. 
Consequently, lists of traits which might be used to describe workers on
practically all jobs have been prepared (8). In making the specifications
for a given job the analyst considers each of the traits and estimates its 
relative importance for the job. The list of worker characteristics de- 
veloped by the United States Employment Service (4) was used in the 
present investigation. This form contains 47 different trait names, each 
of which is to be rated A, B, or C. An A rating indicates a very great 
amount of the trait is needed, such as would be possessed by not more 
than two per cent of the population. B indicates a distinctly above- 
average amount of the trait is necessary, less than that designated by A 
but more than that designated by C. C indicates an amount of the 
trait less than that possessed by the highest 30 per cent of the general 
population. 

In this study no A ratings were made; three of the characteristics 
named on the form received B ratings (eye-hand coordination, memory 
for details, and arithmetic computation); the other 44 traits were rated 
as C. Two new characteristics were added and given ratings of B. 

The reasons for the B ratings were: 


Eye-hand coordination: In order to do satisfactory work the operator must 
be quick and accurate in pressing the keys and in manipulating the levers at 
which he is looking. 

Memory for details: The operator must remember what actions have to 
be performed for each kind of calculation as well as the location of the appro- 
priate keys and levers. He also must be able to remember momentarily the 
numbers on his work sheet in order to put them into the machine and the 
answer appearing in the dials in order to transcribe it correctly. 

Arithmetic computation: This term is used here in a rather general sense 
and might better be called number facility, corresponding to Thurstone’s 
factor N (6). Many of the more complex machine operations can be mastered 
with relative ease if the individual knows how they are related to the simpler 
processes. Also, it occasionally happens that the knowledge of approximate 
answers will save time. 

Perception of details: This characteristic was meant to include much of 
what Thurstone calls factor P. In machine work it is often necessary to 
select certain figures from a whole page of figures. Also, numbers must be 
read correctly both in working the machine and in copying answers. 

Perception of spatial relations: This characteristic was considered as
corresponding in part to Thurstone’s factor S, which represents the ability to
perceive relative positions of things. In calculating-machine work a kind of
spatial ability is needed in putting numbers into a keyboard of nine rows and
ten or more columns. This ability is also needed for locating the other keys
and levers.


Development of Tests 


Suggestions as to the kinds of tests to include in the battery were 
obtained from the “necessary worker characteristics” indicated in the 
foregoing job analysis and from the tests found particularly useful in the 
studies of the United States Employment Service. The tests which 
showed validity coefficients of over .40 in the study having the larger
group of subjects were tracing and pursuit items from the MacQuarrie
Test for Mechanical Aptitude, letter-digit substitution, and arithmetic. 

The nature of the tests selected for the present study was further 
determined by the requirements that they be capable of furnishing ade- 
quate individual differences, and of being objectively and conveniently 
administered and scored. The tests devised were tried on preliminary 
subjects to make certain that these conditions were met. 

Nine pencil-and-paper time-limit tests were chosen according to the 
foregoing considerations. In administering one of the tests, the first step 
was to read aloud the instructions which were printed on the subjects’ test
blanks. These directions were a bare description of what was to be done. 
Next, the examiner read aloud additional instructions which had been 
found helpful in clearing up misunderstandings. A short example of 
completed items was included for further clarification. Next, the subject 
was given a short practice test. Finally, the test itself was administered. 

Samples selected from the practice tests and examples are shown in 
Figure 1. In order of administration the tests were: 


Number Finding: The test consisted of two full pages of randomly- 
placed digits such as are shown in the sample strip. The subject was 
instructed to circle every number 4 on the first page and every number 8 
on the second page. This adaptation of Taylor’s number test was in- 
cluded as a measure of perception of details. The time limit for each 
page was 20 seconds. 

Tapping: Two of the 25 rows of circles contained in the test are shown 
in the sample. The subject was instructed to put a dot in each circle. 
The MacQuarrie dotting and tapping items are similar to this test but 
have much shorter time limits. The test was considered a measure of 
eye-hand coordination. The time limit was 1 minute and 30 seconds. 

Number Tracing: The sample shows the initial portion of a maze 
similar to the one that formed the test. The subject was required to 
trace the correct pattern of 3’s in going down the page. A similar task 
was used by Wolfle in a learning experiment (9). The test was included
as a measure of perception of details. The time limit for trial one was 
45 seconds and that for trial two, 40 seconds. The test was thought 
similar, in terms of the characteristics required, to the MacQuarrie pursuit 
and tracing items, both of which gave high validity coefficients in the 
study done by the United States Employment Service. 

Choice Dotting: The subject was instructed to dot certain of the 
circles of each cluster of circles according to a plan which depended on 
the letter in the center of the cluster and the letter in the previous cluster. 
Four clusters similar to the 28 contained in the test are shown in the 
sample. The test was original and was constructed in order to measure
functions needed for performing quickly the correct movement at a
moment when several movements are possible—a kind of “memory for 
details.” The time limit was 1 minute and 10 seconds. 

Number-Dot Location: The subject was directed to circle dots in the 
three columns of each block of dots according to the numbers printed 
below the block. The bottom dots in a block represented 100, 10, and 1, 
from left to right; the top dots represented 900, 90, and 9; the inter- 
mediate dots represented the intervening values. Six such rows of blocks 
as are shown in the sample were included in the test. This test also is 
an original one and was intended to measure perception of spatial rela- 
tions. The task corresponds rather closely to that of putting numbers 
into the keyboard of a calculator. The time limit was 2 minutes. 
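
The layout of a Number-Dot Location block can be sketched as follows. This is a hypothetical rendering for illustration, not the original test form. It assumes nine dots per column and three-digit numbers with no zero digits, since the description gives no dot for zero. The names `dot_positions` and `render_block` are invented here.

```python
def dot_positions(number: int) -> tuple:
    """Return the dot to circle in the hundreds, tens and units columns.

    Positions are counted from the bottom of the block: 1 is the bottom
    dot (100, 10 or 1) and 9 is the top dot (900, 90 or 9).
    """
    digits = [int(ch) for ch in str(number)]
    if len(digits) != 3 or 0 in digits:
        # Assumption: the description provides no dot for a zero digit.
        raise ValueError("expected a three-digit number with no zeros")
    return tuple(digits)


def render_block(number: int) -> str:
    """Draw one block as text, top row first; 'O' marks a circled dot."""
    columns = dot_positions(number)
    rows = []
    for row in range(9, 0, -1):
        rows.append(" ".join("O" if row == d else "." for d in columns))
    return "\n".join(rows)


print(dot_positions(472))  # hundreds, tens, units positions from the bottom
print(render_block(472))
```

The mapping from digit to key position is the same one an operator uses when setting a number into a full keyboard, which is why the test was thought to resemble the job.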

Digit-Letter Substitution: Nine rows of letters were given in the test. 
The subject was allowed to refer to the digit-letter code (not shown in the 
sample) at any time. This test, which is similar to many which have 
been used in the past, was intended to measure certain aspects of memory 
for details. A similar test showed a high validity coefficient in the 
United States Employment Service’s study. The time limit here was 
1 minute and 15 seconds. 

Number Comparison: This test was adapted from the Minnesota Voca- 
tional Test for Clerical Workers (3). The subject was instructed to put 
a plus mark between identical pairs and a minus sign between pairs 
which differed. The test was intended to measure perception of details 
and immediate memory for details. There was a time limit of 6 minutes 
for this test which consisted of 200 pairs of numbers. 

Letter Matching: On one page groups of letters were printed, each 
preceded by the order number of the group from the top of the page. 
On the facing page there were columns headed by each of the letters used 
in the groups. In the sample shown here the groups of letters and the 
columns are, for convenience, presented adjacent to each other. Thirty 
groups of letters were included in the test. The task was to copy the 
number which was in front of each group onto the opposite page in the 
columns headed by each of the letters in the group. The test was de- 
signed to measure immediate memory for details. The time limit was 
2 minutes. 

Arithmetic Computations: This test obviously was included to measure 
ability in arithmetic computation. A high validity coefficient was found 
for arithmetic items in the study performed by the United States Employ- 
ment Service. Fifty problems were included in this test which had a 
time limit of 5 minutes. 

Except for those tests in which a slipshod performance could result 
in a deceptively high score, test scores were simply the number of items 
done correctly. Tests in which a careless performance might result in a
large number of correct items are those in which the time needed for
marking an item is far less than the time required for accurate judgment. 
They are also tests in which each item presents only a few alternatives 
to choose among. In the tests where these conditions existed errors were 
subtracted from the number of correct items. Omissions were subtracted 
on those tests in which the items were possibly of unequal difficulty for 
the subject—thus preventing him from obtaining an inordinately high 
score by skipping over the more difficult items. Following are the for- 
mulae used for determining scores on the various tests: 


Number Finding, the number of items circled correctly; 

Tapping, the number of circles dotted; 

Number Tracing, the number of maze units negotiated correctly; 

Choice Dotting, the number of circles which should have been dotted 
in the clusters covered minus two points for any cluster on which a 
mistake was made and three points for any cluster skipped; 

Number-Dot Location, the number of columns in which the correct 
dot (and no other dot) was circled; 

Digit-Letter Substitution, the number of correct substitutions minus 
the number of items skipped; 

Number Comparison, the number of correct markings minus the num- 
ber of errors and omissions; 

Letter Matching, the number of correct entries minus the number of 
blanks skipped; 

Arithmetic Computations, the number of problems done correctly 
minus the number of omissions. 
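
The scoring formulae above can be sketched in code. This is a minimal illustration with hypothetical function names. It assumes that "omissions" means items skipped before the last item attempted, which the text implies but does not define.

```python
def score_rights_only(correct: int) -> int:
    """Number Finding, Tapping, Number Tracing, Number-Dot Location."""
    return correct


def score_choice_dotting(circles_due: int, clusters_wrong: int,
                         clusters_skipped: int) -> int:
    """Circles that should have been dotted in the clusters covered,
    less 2 points per cluster with a mistake and 3 per cluster skipped."""
    return circles_due - 2 * clusters_wrong - 3 * clusters_skipped


def score_minus_omissions(correct: int, omitted: int) -> int:
    """Digit-Letter Substitution, Letter Matching, Arithmetic Computations."""
    return correct - omitted


def score_number_comparison(correct: int, errors: int, omitted: int) -> int:
    """Correct markings less both errors and omissions."""
    return correct - errors - omitted


# Illustrative values only; they are not data from the study.
print(score_choice_dotting(circles_due=60, clusters_wrong=2, clusters_skipped=1))
print(score_number_comparison(correct=110, errors=4, omitted=3))
```

Subtracting errors only where guessing is cheap, and omissions only where items differ in difficulty, keeps the simpler tests on a plain number-right basis.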


Experimental Subjects and Procedures 


Validity group. The battery of tests was administered at the begin- 
ning of a 12-week quarter to 51 women enrolled in the class in office 
practice at Armstrong Business College in Berkeley, California.1 One
part of the course concerned the operation of the crank-driven calculator. 
Since 8 of the students did not continue course work long enough to take 
a proficiency test, the sample was reduced to 44 cases. This sample, 
called the validity group, although not as large as would be desired, 
presented several advantages for a well-controlled study. First, none of 
the group had had previous experience in operating the crank-driven 
calculator. Second, as will be explained later, all had the same amount 
of experience at the time of taking each of the proficiency tests. Third, 
there were no obvious sources of excessive heterogeneity. 

Reliability group. Because it was not feasible to obtain the validity
group for a second administration of the battery, a second group of
subjects was chosen for the purpose of obtaining estimates of the test
reliabilities. It included the women taking a course in tests and
measurements at the University of California.2 The battery was
administered to this group on two occasions, the second administration following
the first by two weeks. Forty-nine women took the battery of tests both
times. This sample was called the reliability group. Only a few
members of the group had ever operated a calculator, and those on but a few
occasions.

1 The writer is indebted to Mr. L. L. Deal for making available this group of
subjects and for furnishing criterion data.


Criterion of Proficiency 


Every sixth assignment in the work book in crank-driven calculator 
operation used in the office practice course was a 20-minute test (2). 
The first test included 35 problems, the second test 30 problems, the 
third test 33 problems, and the fourth test 21 problems. The problems 
in the first test were fairly simple ones in addition, subtraction and
multiplication; the other tests included more complicated problems based on
those operations as well as problems involving division. These tests
were used as criterion measures. 

Four models of calculating machine were used by the subjects in 
taking the tests: the Marchant Model D; the Marchant Model M; the 
Monroe Model MA 7; and the Monroe hand-operated machine. The first 
three of these machines are fairly equal in speed of operation since all 
have an electrically-driven crank mechanism, electric carriage shift, auto- 
matic division, and electric clearance. The fourth type, being completely 
hand powered, cannot ordinarily be operated as quickly as the other three. 

Because of the difference in operating speed between the hand and 
electric machines it was necessary to adjust the criterion test scores for 
the individuals taking tests on the hand machine. The number of 
problems done correctly minus the number done incorrectly was used as 
the score on a criterion test for a subject using an electric machine. For 
one using a hand machine, the right-minus-wrong score was multiplied
by the ratio of the mean number of problems done by subjects using the 
electric machines to the mean number done by those using the hand 
machine. These scores were combined into a single distribution of 
standard scores for each of the four criterion tests. Since the whole 
group of subjects did not take all four tests during the quarter, the mean
standard score was obtained for each subject for all of the tests he had 
taken. These mean scores were transmuted to scores in a distribution 
having a range of values of from 1 to 13. The mean of this distribution 
was 6.95 and the standard deviation was 3.04. Table 1 gives, for each
criterion test, the number of individuals using electric and hand machines,
the mean number of problems done by these two groups, the ratio
between the two means, the mean right-minus-wrong score after
hand-machine scores were multiplied by the ratio, and the standard deviation
of the final adjusted distribution. Table 2 gives the correlation
coefficient between each pair of the criterion tests for those subjects taking
both tests of the pair.

2 The cooperation of Professor C. W. Brown in making available this group of
subjects is appreciated by the writer.


Table 1

Mean scores on criterion tests

Criterion   Machine     Mean Problems   Adjusting   Mean     σ
Test        Used        Attempted       Ratio*      R-W      R-W

1           electric        21.9
            hand            17.6          1.24      15.6     4.7

2           electric        18.2
            hand            16.6          1.10      14.4     3.0

3           electric        19.0
            hand            15.3          1.24      15.4     2.8

4           electric        17.3
            hand            17.0          1.02      15.4     2.0

* Adjusting ratio = (mean problems on electric machines) / (mean problems on hand machines)

[The column giving the number of individuals using each type of machine is illegible in this copy.]


Table 2

Intercorrelations between criterion tests

Criterion Test       2       3       4

1                   .60     .45     .13
2                           .84     .24
3                                   .38




The criterion measures were subject to several sources of error. The 
problems were probably not of equal difficulty nor the errors made of 
equal seriousness. The three types of electric machine might actually 
have been slightly different in ease of operation. The ratios used to 
adjust the right-minus-wrong scores for subjects using hand machines 
were based upon means obtained from very small groups for some of the 
tests and thus are of doubtful reliability. The mean criterion scores 
were not equally dependable for all subjects since they were based on 
different numbers of tests. Indeed, it was not possible to obtain an 
estimate of the reliability of the final criterion scores. 

However, there were a number of important advantages in using 
these tests as criterion measures. The problems used, being like those
in the other assignments, were meaningful in determining how well the
subject had learned calculator work as taught in the course. Also, be- 
cause the subjects knew that the tests were used in arriving at course 
grades, they were highly motivated to perform as well as they could. 
Using the performances on several types of machine by equating the 
electric machines on a priori grounds and employing empirical ratios for 
adjusting hand-machine scores broadened the field in which the results of 
this study might be used for prediction. Finally, using scores on all the 
proficiency tests, rather than scores on only those tests which all indi- 
viduals had taken, maintained the size of the sample and made the final 
criterion scores dependent upon the largest number of measures available. 


Results 


The means and standard deviations of scores made by the reliability
group on both administrations of the battery and by the validity group
are presented in Table 3.


Table 3
Means and standard deviations of test scores

                                      Reliability Group
                               First             Second            Validity
                               Administration    Administration    Group
Test                            M       σ         M       σ         M       σ

Number Finding                 69.8     7.4      74.5     8.7      68.0     7.4
Tapping                       223.9    29.7     225.1    25.9     224.5    21.0
Number Tracing                 18.2     7.0      25.6     8.6      17.1     6.1
Choice Dotting                 29.3    16.1      40.9    16.6      15.2     8.1
Number-Dot Location            90.7    15.3     102.0    17.1      85.8    14.0
Digit-Letter Substitution      48.9    11.3      57.3    13.6      48.4     8.4
Number Comparison             102.6    23.0     108.7    24.8     102.1    17.2
Letter Matching                54.5    10.8      60.7     9.9      61.3    11.9
Arithmetic Computations        23.1     9.1      23.1     8.8      21.4     8.5


One of the subjects in the reliability group
and two in the validity group were unable to understand the instructions
for the Letter Matching test and so were not included in these values.
It is to be noted that the mean score of the validity group was lower
than that of the reliability group on the first administration on every
test except Tapping, an especially large mean difference being found on
the Choice Dotting test. Also, in seven of the nine tests, the validity
group had a smaller standard deviation than did the reliability group.
For every test except Arithmetic Computations, there was an increase in
the mean score of the reliability group on the retest. 
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In Table 4 are entered the intercorrelations between the tests of the 
battery. Letter Matching is not included because of the few subjects 
who were not able to start the test. The upper figure in each cell is the 
correlation coefficient for the validity group and the lower figure that for 
the reliability group on the first administration of the test. It is seen 
that the correlation coefficients tend to be higher for the reliability group. 


Table 4
Intercorrelations between tests*

NT CD NDL DLS NC

13 04 12 02 17
03 05 40 32

00 08 23 16 .20
13 -.03 14 -.09

06 . 08 08
42 ‘ 21 07

07 07
39 59

22 27
51

DLS

NC

* NF = Number Finding.    NDL = Number-Dot Location.
  T = Tapping.            DLS = Digit-Letter Substitution.
  NT = Number Tracing.    NC = Number Comparison.
  CD = Choice Dotting.    AC = Arithmetic Computations.

The upper value in each cell refers to the validity group; the lower value refers to
the reliability group.


While the validity group’s highest coefficient was .32, the reliability
group had eight of .50 or over. There were no large negative values,
the highest of the four in the entire table being —.14. There was general 
agreement in order of magnitude between corresponding coefficients for 
the two groups. For example, the highest coefficient obtained for both 
groups was that between Number Comparison and Arithmetic Compu- 
tations, .32 for the validity group and .62 for the reliability group. 
However, while there were no high coefficients in the validity group for 
either Choice Dotting or Digit-Letter Substitution, the highest being .22, 
in the reliability group there were five coefficients of .50 or greater for 
these tests. For Choice Dotting this discrepancy can be explained largely 
in terms of the difference in the variability of scores. The general differ-
ence in the size of intercorrelations may well have been due to the lower
variability of the validity group. 

The test-retest reliability coefficients are presented in the left-hand 
side of Table 5. Especially high values were found for the Arithmetic
Computations and Number Comparison tests, both being over .90. All
of the other tests (except Letter Matching) gave reliability coefficients
quite satisfactory for tests with such short time limits.


Table 5
Reliability and validity coefficients for the tests

Test                                      r12    Validity

Number Finding........................... .69
Tapping.................................. .68
Number Tracing........................... .67
Choice Dotting........................... .80
Number-Dot Location...................... .81    .49
Digit-Letter Substitution................ .79
Number Comparison........................ .91    .29
Letter Matching.......................... .55
Arithmetic Computations.................. .92    .36

The coefficients of correlation between scores on each of the tests 
(except Letter Matching) and the final criterion scores are given in the 
right-hand side of Table 5. The highest validity coefficient was that of 
.49 for the Number-Dot Location test. The second highest coefficient 
was that of .36 for the Arithmetic Computations test, and the third 
highest that of .29 for the Number Comparison test. Although the 
other five values were positive, none was as much as twice the size of its 
standard error. 


Selection of the Final Test Battery 


The combination of predictor items by multiple correlation methods 
generally gives a spuriously high index of the forecasting efficiency of 
the combination. Increasing the number of items in a combination ordi- 
narily increases the predictive efficiency, but this increase becomes less 
and less while the chance error continues to increase. Wherry has de- 
vised a method which corrects the coefficient of multiple correlation for 
this error and at the same time indicates the best combination of tests (4). 

This technique, called the Wherry-Doolittle Test Selection Method, 
was employed to find the final test battery. The method involved 
selecting the test with the highest correlation with the criterion as the 
first test in the battery and then including in succession the tests which 
increased the multiple correlation coefficient the most. Table 6 gives
the size of the multiple correlation coefficient resulting from the addition
of each test to the battery. To the right of this value, in the column
headed R̄, is the estimated population correlation as determined by the
Wherry shrinkage formula. It is seen that the use of only the test with 
the highest validity coefficient, Number-Dot Location, resulted in a value 
of .49. When the Arithmetic Computations test was added a multiple 
correlation coefficient of .55 was obtained, the shrunken value being .53. 


Table 6

Coefficient of multiple correlation between criterion and successively enlarged
test battery*

Test                                      R      R̄

Number-Dot Location...................... .49 = zero-order r
Arithmetic Computations.................. .55    .53
Tapping.................................. .57    .54
======================================================
Choice Dotting........................... .58    .53
Digit-Letter Substitution................ .59
Number Comparison........................ .60
Number Finding........................... .60
Number Tracing........................... .60

* R = multiple correlation coefficient.
R̄ = shrunken multiple correlation coefficient.
The heavy line has been placed under the last test included in the final battery.


Upon inclusion of the Tapping test, the multiple correlation coefficient 
increased to .57 and the estimate based on the shrinkage formula to .54. 
Addition of the Choice Dotting test increased the coefficient of multiple 
correlation to .58, but at this point the estimated population correlation 
decreased to .53. R increased to .60 upon successive inclusion of the 
Digit-Letter Substitution and Number Comparison tests, but did not 
increase further (in the second decimal place) when Number Finding and 
Number Tracing were added. The corrected value, on the other hand, 
decreased with each of these inclusions. 

The heavy line under the Tapping test in Table 6 indicates that 
Tapping was the last test which increased the shrunken correlation 
coefficient, and according to the Wherry-Doolittle method was the last 
test to include in the final test battery. The Number-Dot Location, 
Arithmetic Computations, and Tapping tests were thus indicated as the 
final test battery although the addition of three other tests increased the 
unadjusted coefficient of multiple correlation from .57 to .60. 
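
The stopping rule described above can be illustrated with a short sketch. The text does not print the shrinkage formula, so the form used here, R̄² = 1 - (1 - R²)(N - 1)/(N - m), is an assumption; it was chosen because, with N = 44 subjects and m tests in the battery, it reproduces the shrunken values of .53 and .54 reported for the second and third steps. The function and variable names are hypothetical.

```python
import math

# Assumed form of the Wherry shrinkage formula (not spelled out in the text):
#   R_bar^2 = 1 - (1 - R^2) * (N - 1) / (N - m)
# N = number of subjects, m = number of tests in the battery at that step.
def shrunken_r(multiple_r, n_subjects, n_tests):
    shrunken_sq = 1.0 - (1.0 - multiple_r ** 2) * (n_subjects - 1) / (n_subjects - n_tests)
    return math.sqrt(max(shrunken_sq, 0.0))

N_SUBJECTS = 44  # size of the validity group, as given in the Summary

# Unshrunken R after each successive test is added (rounded values as reported).
STEPS = [
    ("Number-Dot Location", 0.49),
    ("Arithmetic Computations", 0.55),
    ("Tapping", 0.57),
    ("Choice Dotting", 0.58),
    ("Digit-Letter Substitution", 0.59),
    ("Number Comparison", 0.60),
    ("Number Finding", 0.60),
    ("Number Tracing", 0.60),
]

shrunken = [shrunken_r(r, N_SUBJECTS, m) for m, (_, r) in enumerate(STEPS, start=1)]

# The final battery ends at the step where the shrunken value is largest.
best_step = max(range(len(shrunken)), key=lambda i: shrunken[i])
final_battery = [name for name, _ in STEPS[: best_step + 1]]

for (name, r), r_bar in zip(STEPS, shrunken):
    print(f"{name:28s} R = {r:.2f}   shrunken R = {r_bar:.3f}")
print("Final battery:", ", ".join(final_battery))
```

Because the sketch starts from R values already rounded to two decimals, later steps agree with the reported figures only approximately; the point of interest is that the shrunken value rises through the Tapping test and falls thereafter.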

The standard score form of the multiple regression equation, using 
c to indicate the criterion, ndl for the Number-Dot Location test, ac for
the Arithmetic Computations test, and t for the Tapping test, was:


$$z_c = .39\,z_{ndl} + .25\,z_{ac} + .15\,z_t$$


The raw score form of the multiple regression equation using the 
distribution of criterion scores having a mean of 6.95 and a standard 
deviation of 3.04 was: 


$$X_c = .086\,X_{ndl} + .090\,X_{ac} + .021\,X_t - 7.06$$
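
The two forms of the equation can be checked against each other, since each raw-score weight equals the standard-score weight multiplied by the ratio of the criterion standard deviation to the test standard deviation. The check below assumes standard deviations of 14.0 for Number-Dot Location and 8.5 for Arithmetic Computations in the validity group; these are read from the final column of the table of means and standard deviations, and that reading of the table's layout is an assumption rather than something the text states.

```latex
b_i = \beta_i \,\frac{s_c}{s_i}, \qquad
b_{ndl} \approx .39 \times \frac{3.04}{14.0} \approx .085, \qquad
b_{ac} \approx .25 \times \frac{3.04}{8.5} \approx .089
```

These values agree with the printed raw-score weights of .086 and .090 to within the rounding of the standard-score weights.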


The standard error of estimate of the equation was 2.49 in raw score 
terms. The standard error of the multiple correlation coefficient of .57 
was .11. 
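
As an illustration of how the raw-score equation would be applied, the following sketch computes a predicted criterion score and the standard error of estimate. The three raw scores passed in are hypothetical, and the function names are introduced here for illustration only. The standard error uses the usual relation s_c * sqrt(1 - R²), which with the rounded values 3.04 and .57 gives about 2.50, close to the 2.49 reported.

```python
import math

def predict_criterion(ndl, ac, tapping):
    """Predicted criterion score from raw scores on the three selected tests,
    using the raw-score regression weights given in the text."""
    return 0.086 * ndl + 0.090 * ac + 0.021 * tapping - 7.06

def standard_error_of_estimate(sd_criterion, multiple_r):
    """Standard error of estimate: s_c * sqrt(1 - R^2)."""
    return sd_criterion * math.sqrt(1.0 - multiple_r ** 2)

# Hypothetical raw scores for one trainee (not taken from the study).
example_score = predict_criterion(ndl=90, ac=25, tapping=220)

# Criterion SD of 3.04 and multiple R of .57 are the values reported in the text.
see = standard_error_of_estimate(3.04, 0.57)

print(f"Predicted criterion score: {example_score:.2f}")
print(f"Standard error of estimate: {see:.2f}")
```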

The correlation coefficients between scores predicted by the multiple 
regression equation and scores made on the separate criterion tests were 
.55 with the first criterion test, .67 with the second criterion test, .62 
with the third criterion test, and .36 with the fourth criterion test. 
These coefficients provide a partial check on the generality of the pre- 
dictive formula. It is of interest that the coefficients for the second and 
third criterion tests were higher than that for the first test (and even 
for the combination) even though the first test was the only one repre- 
sented on every subject’s criterion score. It is obvious that far higher 
validity coefficients could be obtained for the second and third tests if 
the tests in the battery were weighted to maximize these values. 

It is probable that all of the individual validity coefficients and the 
multiple validity coefficient would have been larger had there been more 
variability in test scores. This factor was illustrated in the lower inter- 
correlations between tests for the validity group than for the reliability 
group. Also, the factors previously discussed which made for unrelia- 
bility of the criterion quite possibly acted to lower the validity coefficients. 
The coefficient of multiple correlation of .57 may thus be somewhat of an 
understatement rather than an overstatement of the relation existing in 
the general population of individuals learning to operate crank-driven 
calculators. However, even the shrunken coefficient of .54 may be 
considered substantial for a battery requiring a total of but eight and a 
half minutes of test performance. 

When the scores made by the reliability group on the three tests 
selected for the final battery were weighted according to the regression 
equation, the test-retest reliability coefficient was found to be .89. 


Summary 
A battery of nine pencil-and-paper tests was constructed in an at- 
tempt to obtain measures of abilities needed for proficiency in the 
operation of a crank-driven calculator. The selection of tests was based 
upon an analysis of the job in terms of the worker characteristics which
appeared necessary and upon results found in previous studies. 

The scores made on these tests by a group of 44 women learning to 
operate the crank-driven calculator were correlated with criterion meas- 
ures based upon examinations in calculator work taken during the training 
course. The three tests showing the highest correlations with the cri-
terion were Number-Dot Location (a “paper keyboard” test), Arithmetic
Computations, and Number Comparison, with coefficients of .49, .36, 
and .29 respectively. A multiple correlation coefficient of .60 with the 
criterion was obtained by combining the Number-Dot Location, Arith- 
metic Computations, Tapping, Choice Dotting, Digit-Letter Substitution, 
and Number Comparison tests. However, the final test battery selected 
by means of the Wherry-Doolittle Test Selection Method included only 
the first three of these tests. The final battery exhibited a multiple 
correlation coefficient of .57 with the criterion of proficiency. Applica- 
tion of the Wherry shrinkage formula indicated a population coefficient 
of .54. The writer is of the opinion that both of these estimates are too 
low in view of the homogeneity of the sample used and the errors of 
measurement involved in obtaining criterion scores. Correlations be- 
tween predicted scores and scores made on the separate performance tests 
(which were combined into the single criterion) varied between .67 and .36. 

The test-retest reliability coefficients were found for a sample of 49 
women students enrolled in a university course in tests and measurements. 
Two of the tests, Number Comparison and Arithmetic Computations, 
had reliability coefficients slightly above .90 and all but one of the others 
had coefficients of .67 or above. The reliability coefficient of the final 
battery of three tests was estimated to be .89. 

A higher multiple correlation coefficient than that obtained would 
certainly be desired if this battery were to be used as a sole means of 
selection. However, the lack of variability of the test scores in the 
sample used as well as the nature of the criterion scores may have been 
important in preventing a higher coefficient. Further study should be 
made on the predictive worth of these tests; the multiple regression 
equation here obtained should be tested on other samples. 


Job Analysis: A Resumé and Bibliography


Adequate and accurate job information is of vital importance at the
present time due to shortages of skilled and semi-skilled workers in 
American industry. Such information, when available, makes it possible 
to cope with these shortages by establishing brief within-industry training 
programs, the breaking down of complicated jobs into less complicated 
jobs, and the transferring of workers with needed skills from non-essential 
to essential jobs. The job information which makes these readjustments 
possible is obtained through a procedure known as “job analysis.” 
Uhrbrock (358) gives a very satisfactory definition of job analysis when 
he states, “Job analysis is a ‘method’ of gathering pertinent facts about 
a worker and his work. The method to be used varies, depending upon 
the objective of the study. Different sources are consulted, and different 
records result, depending upon whether one is using job analysis to devise 
a training program, develop a safety plan, prepare employment specifi- 
cations, or revise a wage payment plan.” 

Factual information uncovered during a job analysis is generally re-
corded on a “job analysis schedule” or “job specification form” which
contains a preestablished list of items restricted to the activities and 
requirements of the job, such as: Job title; industry, branch and depart- 
ment; sex; age and education; amount of experience in the same, similar, 
and other jobs; equipment used; materials used; surroundings and occu- 
pational hazards; a detailed description of the work performed, etc. 
Ample space is also provided for the recording of unusual information 
concerning the job undergoing analysis. 

The three basic methods of performing job evaluations are the Rank- 
ing, Classification, and Point systems. The Ranking method consists 
in objectively grading the job descriptions according to the value of the 
work, each job being evaluated in terms of other jobs and not in terms of 
salary or wage rates. The Classification method consists of a preestab- 
lished series of categories or classifications to which jobs are assigned. 
The Point system consists of a predetermined list of factors, generally
common to all jobs, for each of which a schedule of points has been
prepared. An example of the Point system would be the assignment of
one point for a grammar school education or less, two points for a
maximum of two years of high school, and three points for completion
of four years of high school.

There have been two recent developments in the field of job analysis, 
namely, the Factor Comparison Method of job evaluation, and the 
Functional Pattern Technique of job classification. Briefly, the Factor 
Comparison Method is based upon the utilization of information obtained 
through a number of separate techniques. These techniques may be 
listed as follows: (1) A detailed written job description covering all of the 
important duties and responsibilities of each job; (2) A job-to-job com- 
parison to determine the relative difficulty of the jobs; (3) A comparison 
of the jobs according to those factors which are generally common to all 
jobs; (4) The establishment of a list of “key jobs,” relating wages or 
salaries to job difficulty, to be used as a measuring scale; (5) The intro-
duction of the element of quantity to each factor by means of the “Point”
system; (6) A division of each key job wage or salary among the following 
five factors: Mental, Physical, Skill, Working Conditions, and Responsi- 
bility, thereby enabling the quantification of the factor scales; and (7) 
The utilization of salaries or wages of each job as a measure of the point 
values of the jobs. Inasmuch as jobs are constantly changing because 
of technological advancements and other causes, the Factor Comparison 
Method easily copes with the common difficulty of how to classify 
“borderline” jobs, that is, should they be classified as skilled or semi- 
skilled; semi-skilled or unskilled, etc. 

The Functional Pattern Technique, which is utilized after the job 
analysis has been completed, consists in classifying jobs according to 
functional similarities. Jobs may differ according to job description and 
job title and still be closely related insofar as background, training, 
experience, skill, and other fundamental factors are concerned. In many 
instances there may be unrelated jobs in different departments or indus- 
tries possessing practically identical job requirements. When this occurs, 
the desirable procedure is to cover these similar jobs by the same job 
specifications. This new technique is illustrated by the preparation of
“Job Families,” which is described later in this paper.

The United States Employment Service division of the Bureau of 
Employment Security, Social Security Board, is performing outstanding 
work and research in the field of job analysis through its many field 
offices located throughout the nation. Also used in conjunction with 
the United States Employment Service job analysis schedule is a modified 
form of the Viteles Psychographic Method for studying occupational 
requirements. The “Occupational Characteristics Check List,” as it is 
termed, contains 45 descriptive phrases, such as, eye-hand coordination, 
dexterity of hands and arms, finger dexterity, arithmetic ability, etc. 
Upon completion of a job analysis, the analyst compares each of the 45
descriptive phrases with the job and indicates, by checking, whether a
very great amount of the trait (such as would be possessed by not more
than 2 per cent of the general population), a distinctly above-average 
amount (such as would be possessed by the highest 30 per cent of the 
general population, but less than the highest 2 per cent), or an amount 
less than that possessed by the highest 30 per cent of the general popula- 
tion, is necessary for successful performance on the job. 

These descriptive phrases of job abilities, skills, etc., are of value to 
the United States Employment Service in preparing “Job Families” 
which are groups of occupations fundamentally alike, regardless of job 
titles or job descriptions, as far as basic skills, abilities, temperament, 
and other characteristics are concerned. This “Job Family” information 
is used as a guide in the transferring of workers from non-essential to 
essential jobs, selecting workers for retraining, curriculum planning in 
trade schools, and placement and guidance procedures. Other valuable 
outgrowths of job analysis are the “Dictionary of Occupational Titles,”
Volumes I, II, and IV (62, 63, 64), and the “Job Descriptions” (65, 66,
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80), which have been 
prepared by the United States Employment Service for the use of public 
employment offices and related vocational services. 

An extensive survey of job analysis literature reveals there are ap-
proximately 20 uses for job analysis information: (1) Job grading and
classification; (2) Wage setting and standardization; (3) Provision of 
hiring specifications; (4) Clarification of job duties and responsibilities; 
(5) Transfers and promotions; (6) Adjustment of grievances; (7) Estab- 
lishment of a common understanding between various levels of workers 
and management; (8) Defining and outlining promotional steps; (9) In- 
vestigating accidents; (10) Indicating faulty work procedures or dupli- 
cation of effort; (11) Maintaining, operating and adjusting machinery; 
(12) Time and motion studies; (13) Defining limits of authority; (14) In- 
dicating cases of individual merit; (15) Indicating causes of personal 
failure; (16) Education and training; (17) Facilitating job placement; 
(18) Studies of health and fatigue; (19) Scientific guidance; and (20) 
Determining jobs suitable for occupational therapy. 

The preceding resumé has briefly considered the values of job analysis, 
the structure of the job analysis schedule or job specification form, the 
basic techniques of job analysis, some recent developments, and some 
outstanding contributions of job analysis. The following bibliography 
covers job analysis literature which appeared in various journals and
other publications from 1911 through 1941.


More attention is currently being given to studies of recorded music as
one of the psychological factors related to the efficiency and satisfaction 
of factory workers. Such factors of course are numerous and complex, 
hence investigations of this type may seem atomistic and feeble. The 
same is true of most studies made as to worker efficiency for they usually 
examine only one specific factor such as rest pauses, illumination, 
humidity, temperature, oxygen, and distraction. The oft quoted Haw- 
thorne experiment, however, suggests that the more significant factors 
in worker efficiency may be far more subjective and elusive forces (19). 
Perhaps music is in this area. 

It is reported that a long time ago there were many kinds of work 
which could not be done without music. Sea chanteys and work songs are 
a part of our national heritage. This seems like a far cry from recordings 
by Tommy Dorsey, electrically distributed throughout the modern fac- 
tory. There is good reason to believe, however, that music in the factory 
tends to add to group morale and esprit de corps, to relax tensions and 
relieve boredom. Questions as to its measured effectiveness naturally 
come to the psychologist and personnel manager, hence studies in this 
field of applied psychology seem appropriate and challenging (14). 

An examination of many of the reports as to the use of music in 
industry shows that they are subjective and that they rest mainly on 
the belief that “worker boredom” is due to a consciousness of uniformity
and repetition in work tasks. The assumption seems to be that any-
thing that will “take the mind away” will reduce boredom. The activi-
ties most commonly reported to be reinforced by music are the motor
processes of industrial operations in which auditory components are 
lacking or are at a minimum. 

Perhaps the best known study in this field was made in England by 
Wyatt and Langdon (22). They tested the effect of music upon workers 
engaged in simple repetitive tasks. The output of twelve young workers 
was recorded at fifteen-minute intervals during the day. Five morning 
conditions were tested over a period of ninety-five days as follows: 
Thirty days without music; then fifteen days with thirty minutes of 
music followed after a half-hour by forty-five minutes of music; then ten 
days with four thirty-minute periods at half-hour intervals; and finally
twenty-five days without music. 

Wyatt and Langdon reported that the workers increased the average 
output, with the largest increase, 6 per cent, resulting from morning 
music; the smallest, 2.6 per cent, from one period each in morning and 
afternoon. Although the averages show consistent improvement from 
the music, individuals differed in their reaction to it. Because of the
small number of workers in the experiment, the evidence on the relative
effectiveness of different types of music on output is hardly conclusive.
The types ranked in the following order of effect on output: one-steps,
fox-trots, waltzes, light classics.

Their study of 350 other factory workers revealed that these em- 
ployees feel that time drags most during the first two hours of each half 
of the work-day, and that from 77 to 97 per cent of these same workers 
feel that they can think of other things while they work and that time 
passes more quickly when they do think of other things. Music’s ap- 
parent favorable influence upon output may be due to its ability to add 
to the imagery in the consciousness of the worker which may be dulled 
by the concentration upon a repetitive task long since mastered. 

The effect of phonograph music upon the output of eighty-eight female 
assemblers of radio tubes was studied by Humes (10) over a period of 
many weeks. His interest was focused upon the scrappage rate and its 
correlation with the presentation of slow music, fast music, mixed pro- 
grams of slow and fast music, and no music at all. Both slow and fast 
music showed less scrappage than the absence of music or than mixed 
programs. Employee morale was reported to be higher with music than 
without it. Humes exercises appropriate caution in interpreting his 
results, pointing out reversals of effect, the specificity of the effect, and 
also the possibility that non-musical factors might have influenced his 
results. 

At a meeting of the American Society of Mechanical Engineers in 
October, 1942, Professor Harold Burris-Meyers (2), of Stevens Institute 
of Technology, spoke on “Music in Industry.” He reported research
done by himself and Raymond N. Cardinell, of the same institution, in 
several factories. He cautiously comments: 


“The data we have are indicative. They are not sufficient to form the 
basis of unassailable conclusions. . . .” 


In connection with his talk, Professor Burris-Meyers showed several 
charts reporting his chief findings, viz.: 


(1) That daily output of employees plotted against time showed an in- 
crease of 6.8% where music was broadcast, as compared with production 
figures under precisely similar conditions where there was no work music pro- 
vided. In over 75% of all factories studied the total production was found to 
be greater when music was broadcast than when it was not used. 
(2) The total production per 100 manhours for a group of approximately
100 employees of all degrees of experience, in two typical weeks, one before 
and one after a music installation was made, indicated an average production 
increase of 11.4%. 

(3) When production increases were set up in blocks showing the average
production per 100 manhours in one week, it was reported that in only one
week after music was installed was production lower than during the “control
week” before the musical installation.

(4) Chart 4 showed that, for operations requiring a high degree of manual
dexterity and sense of timing, there was an average increase in work
production of 4.97%.

(5) In Chart 5 it was reported that absenteeism and early departures 
could be counteracted by use of music. 

(6) The lines in Chart 6 showed the percentage of absences per week for 
four average weeks before, and four after music installations were made, and 
the blocks show the four-week averages. From an average of 22.15% without 
music there was a reduction to a 2.85% average with music. 

(7) In the case of properly planned music programs, as contrasted with 
an average program by any one of many self-styled experts, an increase in 
production of 6.8% was determined when the program was properly organized 
and planned to do the greatest amount of good. 

Considerably less certain is the favorable effect of music on opera- 
tions which require continued mental concentration. In a school experi- 
ment, Jensen (11) found that typing output was somewhat reduced by 
jazz and dirge music. Hevner (8) found that for certain compositions, 
major modality, fast tempo, high pitch, flowing rhythm, and simple 
harmony tend to express happiness best. Podolosky (16) has compiled 
data to show that music tends in general to increase pulse rate, respira- 
tion, and metabolism, lower the threshold for sensory stimuli of different 
modes, and reduce the regularity of respiration. 

Observations and opinions reported in various ways have tended 
to substantiate the thesis that music has several desirable influences 
in industry. Ramsay, Rawson, and others (17) report from a 1939 
British survey that 74.5% of employers using music believe that it in- 
creases work efficiency. Wynford Reynolds (18), in charge of a British 
program, has declared of industrial music, “It is a tonic like a cup 
of tea, something to cheer the mind. You will get increased output 
all right, but it will be spread over the work-spell as a whole. You will 
not necessarily get it while the music is actually being played.” Accord- 
ing to the British Industrial Welfare Society (18), “On the whole the 
concensus . . . seems to be that music at work does much to relieve the 
monotony of repetitive work and produces a stimulus to increased output, 
and in the opinion of the Industrial Welfare Society there is no doubt 
that this development is not merely a wartime one, but that music at 
work will remain a definite feature of industry.” Writing in a lay 
journal, Antrim (1) stated boldly that: 

“In plants up and down the land, I have watched workers coming off one 
shift or going on another with firmer step because of the music. I have seen 
faces light up when the music comes on in the middle of the day, have heard
feet tap and lips sing. ‘I like the music,’ said a man on a noisy assembly line. 
‘It’s cheerful, and I have more pep when I get home.’ ” 

In another journal, it was reported that production in British war
plants has been stepped up by 12½ to 15% for an hour after musical
programs have been introduced by the British Broadcasting Corporation.
According to a report in Marketing for October 17, 1942:

“One of the important minor jobs of the B.B.C. is to keep the industrial 
front working cheerfully and at high speed to produce the required war equip- 
ment,” reads the message. “The Corporation has three programs to achieve 
this end, all based on the view that the music must be familiar to the ordinary 
worker, given either at the beginning of the day’s work or at the end of a 
particularly trying day.” 

“The melody should be clear and well defined, able to ride over factory 
noises. Tone level or volume should be constant, and the tempo or the rhythm 
should create a bright and cheerful atmosphere. Music is best suited for 
workers who are employed on repetition or other monotonous work (especially 
female labor). The tone of an organ is unsuitable for amplification in fac- 
tories, and so is ‘hot’ music or ‘jazzing’ of any melody. Loudspeakers should 
be small and well placed about the plant rather than large and only one or two 
to a department. Vocal items and speeches should be avoided.” 

In a recent study Kerr (13) found that 96% of a group of 162 defense 
trainees reported a favorable average belief in the psychological effects 
of music, 90% expressing a favorable belief in music’s effects on one 
working at a “wearisome, monotonous task,” 77% thinking music im- 
proves one’s feelings “toward the people around you,” 90% that it helps 
“when you are tired,” 87% that it helps one’s nerves, 56% that it helps 
digestion, and 85% declaring that it helps make one forget his worries. 
Factor analysis of the configuration of beliefs suggested presence of an 
efficiency factor and a morale factor. 

Kerr’s (12) findings from a Feelings About Music study of 229 elec- 
trical workers were similar except that a physiological factor appeared. 
He also found that the electrical workers prefer to work on a floor with 
music as opposed to a floor without music. A slight tendency was found 
in several groups for older persons to be less favorable than younger 
persons toward industrial music and females were slightly more favorable 
than males. This type of study needs to be carried on in actual work
situations, for different types of work and in different geographic areas.

There is much opinion but little experimental evidence as to the 
effects of different types of music. Reynolds (18) maintains that British 
experience indicates that slow waltzes, rhumbas, hot music, music that 
is too thickly scored, and vocals should be avoided. In a survey of 
British employers using music it was found that different types of broad- 
cast programs received a rank order as follows: light orchestra without 
vocal, ballroom orchestra without vocal, brass band without vocal, swing
orchestra or accordion without vocal, small novelty combination featuring
xylophone, band with five vocals, light orchestra dance band without
vocal, theatre organ, rhythmic records of light classical music, dance- 
time records, military band, salon orchestra, dance band, band playing 
folk songs and dances of another nation. Thorpe (21) studied the type 
preferences of 475 college and high school students after they heard an 
orchestra play eight types and found that the top four fell into approxi-
mately this order: military march, semi-symphony, concert waltz, de-
scriptive piece (“Whispering Flowers”); type preferences were not notably
related to intelligence, college grades, or curriculum pursued.

Fay and Middleton (4) played four types of classical and two types 
of popular music over a sound system to 54 college students and the 
students ranked the types from most to least pleasant in this order: 
light classical, old classical, romantic classical, swing, modern classical, 
and sweet; all were rated as more pleasant than unpleasant. Freyman 
(18) reports that analysis of 190 replies received from 250 British workers 
indicates a demand for more modern popular tunes, music-hall songs, 
dance-time music and waltzes and less light classical, marches, and hot 
jazz. All of these 190 replied that they consider music a pleasant back- 
ground to their work. Humes (10) concluded from his study that 
quality of output is generally better under all fast music or all slow music 
than under programs arranged with pieces proceeding from most to least 
familiar. After experiencing several months of music, the British factory 
subjects gave all their votes to fox-trots and waltzes and none to one- 
steps, marches, or light classical. One-steps, however, appeared to have 
the most favorable effect on output (22). 

An article in Modern Industry for September 15, 1942 on “Plant 
Broadcasting” states: 


“One point on which considerable difference of opinion has been voiced
seems now to be answered by experience. That’s whether or not vocal selections
are distracting. Although there’s still some conflicting opinion, many
users agree that numbers with vocal choruses be included in music programs.
The argument against them is that workers listen for the words, thus
don’t give proper attention to their work. But provided the singing voice is
not extreme in character, there doesn’t seem to be any noticeable distracting
effect on the workers, though British plants had to eliminate ‘Deep in the
Heart of Texas’ because workers stopped work to clap their hands in the
chorus.”


Diserens (3) made observations of dynamometer grips, with and with- 
out music, but this cannot be considered an experiment conducted under 
industrial conditions. Similar comment must be made of several other
experiments (5, 6, 7, 15) but they have obvious implications for industry. 
It might be said that Seashore (20) stated something of an apologetic 
for music in industry when he wrote: 


“Why then do we love music? Among other things it creates a physiological
well-being in our organism; it is built from materials which are beautiful
objects in themselves; it carries us through the realms of creative imagination, 
thought, actions, and feelings in limitless art forms; it is self-propelling through 
natural impulses, such as rhythm; it is the language of emotion, a generator 
of social fellowship; it takes us out of the humdrum of life and makes us live 
in play with the ideal. . . .” 

It seems fair to conclude after a survey of experimental literature that 
no highly significant or conclusive research has been published concerning 
the effect of music on the output or health of workers in industry. It is 
a field of research that may be attractive and rewarding for the psy- 
chologist who is interested in industry and in social processes. It is pos- 
sible to note some generalizations that come from experimental work 
and from intelligent observations made by management and labor. 
There is good reason to believe that the use of music in industry does 
relieve boredom and that it facilitates socializing. There is a general 
agreement that, properly controlled, music may increase happiness and 
contentment in work, improve output and lessen fatigue, and make the 
work-setting attractive to applicants. Experience and experimental 
evidence also seem to indicate: (1) music is most often appreciated by 
workers who perform repetitive manual tasks which require little mental 
concentration; (2) the music should be “turned on” only for comparatively
short periods, and these should be determined after study
of fatigue curves; (3) workers believe that music is desirable and helpful
in their “feelings” during work hours; (4) music is a hindrance to
those types of work which demand mental concentration; and (5) as a 
general rule, two or perhaps three periods of music of less than a half 
hour’s duration produce the most satisfactory results. 

What Are Young People Asking About Marriage?


Homer L. J. Carter and Louis Foley 
Psycho-Educational Clinic, Western Michigan College, Kalamazoo, Michigan 


In recent years much has been said about preparing young people for 
marriage. Such preparation has seemed to be urgently needed, in view 
of the large proportion of present-day marriages which apparently end 
in utter failure. If, for instance, divorces were four times more common 
in 1930 than in 1890,¹ it would look as if far too many were entering into
the married state without being intelligently prepared to maintain it. 
Better knowledge of what is required for success in this basic human 
relationship might well tend to correct the situation in two ways; it might 
prevent the formation of unions which recognition of the facts will show 
to be clearly ill-advised, and it might give to really justifiable marriages 
a much better chance to succeed. 

When young people are thinking seriously about getting married, in 
a democratic nation it is neither possible nor desirable for outsiders to 
make up their minds for them. They must decide for themselves. 
Opportunity may be given them, however, to find out the things that 
they need to know in order to reach an intelligent decision. With such 
an aim in view, it appears relevant to inquire of the young people them- 
selves what sort of information they feel that they need. 

It has been calculated that in the United States the average age at 
which marriages occur is 22.4 years for women and 25.6 years for men.²
This age is amply covered by the present study, which has to do with a 
collection of young people between nineteen and twenty-nine. These 
were invited to list several questions which they personally felt to be of 
real importance. They were asked not to sign their names, but the forms 
on which the questions were to be written contained a number of blanks 
to be filled in. Thus they were asked to state their age, sex, race, marital 
status, and academic attainment. Their place of habitation was to be 
indicated under one of five headings: metropolitan area, city under 
150,000, suburban area, village, and open country. They were asked 
also as to their church affiliation and the approximate income of their 
family. The various items of information obtained in this way have
made it possible to classify these people from several points of view,
and, in connection with any question, to take account of the sort of
person who proposed it.

1 Pressey, Sidney L., Janney, J. Elliott, and Kuhlen, Raymond G. Life: A
psychological survey. New York: Harper & Brothers, 1939. P. 29.

2 Ibid., P. 27.

In response to the request, 418 young people, mostly college students,
set down 1426 questions. The sexes were almost equally represented:
fifty-one per cent women and forty-nine per cent men. The median 
chronological age of both sexes was 21.4 years. Distribution with respect 
to living places showed: 


[Table: percentage distribution of respondents by place of residence. Only the first entry, 22 per cent, and the total, 100 per cent, are legible.]


Among thirteen religious groups represented, Methodist, Catholic, 
Presbyterian, Congregational, and Baptist predominated in that order of 
numerical importance. The median family-income was $1411.00. Me- 
dian scholastic attainment was that of a junior in college. Approximately 
8 per cent of those responding reported that they were married. 

For purposes of analysis, the questions which these people proposed 
were classified under six heads: general information, economics, religion, 
sex, children, and health. Sorted into these divisions, the 1426 questions 
were found to stand thus: 


General aspects of marriage........................ 36 per cent
Economic aspects................................... 22 per cent
Children........................................... 12 per cent
Religion........................................... 11 per cent
Sex................................................ 10 per cent
Physical and mental health.........................  9 per cent
                                                   100 per cent

Questions ranged in frequency from one which was asked as many as
fifty-four times to others which were asked only four times. An example
may here be cited, however, to show how the data were simplified. For 
practical purposes the following questions—along with still others having 
apparently the same meaning—were considered as equivalent: 

Should a woman teach after marriage? 

Should a married woman teach? 

Is it advisable for a woman to teach after she is married? 


Should a married woman seek a career? 
Should a woman who is married work outside the home? 


It seems clear that all these have the same essential content as the 
query, “Should a married woman seek a career?” By analogous inter- 
pretations, the 1426 questions were reduced to 63 basic ones which suffi- 
ciently covered the meaning of all. The 63 were then ranked according 
to the frequency with which they had been propounded, and these
rankings were transmuted into units on a linear scale.* Here are the 63
questions, with the scale-value of each: 


General Information Concerning Marriage 


1. Should a married woman seek a career? (8.2)
2. Should interests of man and wife be parallel? (7.5)
3. Should people marry who have great difference in age? (7.0)
4. Is a difference in formal education an important factor in determining
the advisability of marriage? (6.6)
5. How much consideration should be given to the counsel of one’s parents
concerning marriage? (6.6)
6. Can a lasting marriage be founded upon only respect and congeniality?
7. Can differences of race and social status be overcome? (6.3)
8. Should a couple have similar standards of homemaking? (6.0)
9. What makes a marriage permanent? (6.0)
10. Should students marry while still in school? (5.8)
11. What part do “in-laws” play in unhappy marriage? (5.2)
12. Is a long engagement advisable? (5.0)
13. What factors should a woman consider in selecting a husband? (5.0)
14. Should a dominant woman marry a man of more submissive temperament? (4.5)
15. What factors should a man consider in selecting a wife? (4.5)
16. What are the dangers of a long courtship? (4.5)
17. Should contrasting types marry? (3.9)
18. Should a girl marry a man who she is sure loves her, but for whom
she has only deep respect? (3.9)
19. What are the most frequent causes of divorce? (3.9)
20. Do married women who work outside the home increase the hazards
of marriage? (3.9)
21. Does academic training increase the possibility of a happy marriage? (3.9)
22. Which should be the older, man or wife? (3.3)
23. Is it wise to marry when the occupation of man and wife will keep
them apart considerably the first year? (1.4)


Economic Aspects of Marriage 


24. Should a couple wait until they have sufficient funds to establish a
home, or should they marry and try to get along on less? (9.4)
25. How much annual income is needed to maintain a home? (8.8)
26. How much money should a couple have before marriage? (7.7)
27. What are the arguments for and against credit buying? (6.3)
28. How should the income be budgeted? (5.7)
29. What proportion of the income should be spent for food, shelter,
clothes, recreation, and other necessities? (5.5)
30. How can husband and wife learn to buy economically? (5.5)
31. If both husband and wife work, what is the most satisfactory method
of handling the income? (4.9)
32. How can one be reasonably sure of economic security for one’s wife
and family? (4.5)
33. How can husband and wife cooperate in managing their finances after
marriage? (4.3)


*Symonds, P.M. Diagnosing Personality and Conduct. New York, D. Appleton- 
Century Co., 1931, pp. 90-92. 
Information Concerning Religion


34. Why is re-marriage following divorce wrong from the point of view of
the church? (8.2)

35. Are “mixed” marriages a cause of divorce? (7.8)
36. Why are trial marriages frowned upon by the Christian Church? (7.0) 


Information Concerning Sex 


37. How can one obtain reliable information concerning birth-control?

38. What are the arguments favoring early marriage? (6.2) 

39. Where can reliable information be obtained concerning sex and sex 
technique? (5.2) 

40. Is chastity before marriage a vital factor in future happiness and 
successful marriage? (4.8) 

41. Is it advisable for young people who are engaged to have sexual intercourse
before marriage? (4.2)

42. What are the most satisfactory methods of contraception? (4.2)

43. Are vacations from each other advisable for husband and wife? (4.0) 

44. How can husband and wife maintain the romance and charm of their 
courtship? (3.9) 

45. Should both men and women adhere to the single standard? (3.0) 

46. How frequently should husband and wife have intercourse? (2.8) 

47. Is it advisable for married couples to have separate beds? (1.4) 


Information Concerning Physical and Mental Health

48. Should one marry if there is a history of mental illness in the family?


49. How can one recognize, combat, and prevent venereal diseases? (6.0) 

50. Why should a health certificate be required of both parties before 
marriage? (4.7) 

51. To what extent should one investigate the background of his or her 
affianced? (4.7) 

52. How a2 ‘sa wane and how can one secure a physician to perform 
one ; 

53. Should an unmarried woman resort to abortion? (1.4) 

54. Should one confess his or her sterility? (1.4) 


Information Concerning Children 


55. Should both husband and wife desire a family before having children?
(6.4)

56. During temporary loss of income should the wife leave children and 
home to work if offered the opportunity? (5.7) 

57. If either husband or wife has a serious physical defect, should they 
have children? (5.2) 

58. How can two normal persons be sure that their children will not be 
defective? (5.2) 

59. How and when can parents advise their children concerning matters 
of sex? (3.0) 

60. What is the best time of the year to have children, and how many
years apart should they be spaced? (3.0) 

61. How long should people wait before having their first child? (2.6) 

62. Should parents discuss childbirth in the presence of children? (2.5) 

63. Should adopted children be informed that they are adopted? (2.3)
A comparison of information sought by men and women concerning
certain aspects of marriage is shown in Table 1. Here it appears that
the women showed somewhat more interest than the men in questions
concerning children, in religious problems of marriage, and in economic
aspects, whereas the men were more inclined than the women to ask
about matters of sex, and especially to make queries dealing with health.
The sexes show no difference in degree of interest in the general aspects
of marriage. Interpretation of these data should of course take account
of the fact that of the persons involved 49 per cent were men and 51
per cent women. A critical ratio (CR) of three or more suggests that
the observed difference is not due to chance.

Table 1

                                      Percentage asked by
Classification      Number of
of Questions        Questions         Men        Women        CR*

General                513             49          51           .5
Economic               314             40          60          3.2
Children               171             44          56          1.6
Religion               157             43          57          1.8
Sex                    143             54          46          -.9
Health                 128             66          34         -3.6

* CR’s are for percentages in the right-hand column and are positive unless
otherwise indicated.
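
The critical ratios reported in these tables can be approximated with a short calculation. The sketch below is an illustration only: it assumes that each CR compares the right-hand percentage with an even 50-50 split and divides the difference by the standard error of a percentage under that split. The article does not state the formula actually used, and the function name `critical_ratio` is hypothetical. The sample figures (47 per cent of 513 questions) are taken from the first row of Table 3.

```python
import math


def critical_ratio(percent_observed, n_questions, percent_expected=50.0):
    """Return (observed - expected) / standard error of a percentage.

    Assumption (not stated in the article): the standard error is taken
    under the expected split, i.e. 100 * sqrt(p0 * (1 - p0) / n).
    """
    p0 = percent_expected / 100.0
    standard_error = 100.0 * math.sqrt(p0 * (1.0 - p0) / n_questions)
    return (percent_observed - percent_expected) / standard_error


# First row of Table 3: 513 questions, 47 per cent from rural areas.
cr = critical_ratio(47.0, 513)
print(round(cr, 2))

# The article treats an absolute CR of three or more as unlikely to be chance.
print(abs(cr) >= 3.0)
```

Under these assumptions the first row of Table 3 gives a ratio of about -1.4, well short of the threshold of three, so that difference would not be treated as reliable.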

Table 2 shows the differences in information sought by two age-groups,
under and over twenty-one years. It will be seen that the more
significant data are those concerning questions about religion, sex, and
health. While nearly two-thirds of the queries dealing with religious
aspects of marriage were made by persons under 21, the percentages
were exactly reversed for questions having to do with sex. An overwhelming
majority of the queries concerning health were made by those
under 21. In connection with these data it is to be noted that 45 per cent
of the persons were under twenty-one years and 55 per cent were older.

Table 2

                        Percentage Asked by Groups
Classification
of Questions            Under 21      Over 21        CR

General                    44            56          2.7
Economic                   55            45         -1.8
Children                   43            57          1.8
Religion                   62            38         -3.0
Sex                        38            62          2.9
Health                     87            13         -8.5

In Table 3 we have a comparison of the kinds of information sought
by those living in cities and those in rural areas. Here again it is found
that matters of religion, sex, and health furnish some of the more significant
data. Nearly two-thirds of the health questions, and more than
two-thirds of the sex questions, came from city-dwellers, while people
from rural areas were not far from being correspondingly conspicuous for
mentioning religious problems. Of the people included in this study,
41 per cent were from metropolitan areas or cities under 150,000, and
59 per cent from villages or the open country.

Table 3

                                     Percentage Asked by Groups
Classification      Number of          In        In Rural
of Questions        Questions        Cities       Areas        CR

General                513             53           47        -1.4
Economic               314             55           45        -1.8
Children               171             53           47         -.8
Religion               157             39           61         2.8
Sex                    143             70           30        -4.8
Health                 128             65           35        -3.2

Table 4 compares women from metropolitan and rural areas as to the
sort of information which they sought. Once more we find religious, sex,
and health factors predominant. City women asked three-fourths of the
questions concerning sex, and a still larger proportion of those having to
do with health. On the other hand, women from rural areas contributed
a majority of the queries involving religious issues. Account is to be
taken of the fact that 41 per cent of the women stated that their homes
were in cities, and 59 per cent that they lived in rural communities.

Table 4

                                     Percentage Asked by Women
Classification      Number of          In        In Rural
of Questions        Questions        Cities       Areas        CR

General                262             57           43        -2.3
Economic               188             47           53          .8
Children                96             54           46         -.8
Religion                89             40           60         2.0
Sex                     66             75           25        -4.0
Health                  44             80           20        -4.0
Table 5 deals similarly with men from the two kinds of localities.
In this table the more significant facts have to do with questions about
economic, religious, and sex aspects of marriages. Two-thirds of the
queries involving sex, and nearly as large a proportion of those involving
economics, were made by men living in cities. Conversely, an almost
equally large majority of questions concerning religious issues came from
men whose homes were in villages or the open country.

Table 5

        Percentage Asked by Men
          In        In Rural
        Cities       Areas        CR

          47           53         1.0
          63           37        -2.9
          50           50          0
          66           34        -2.8
          52           48         -.3
          38           62         2.0

The results of this study suggest that the sex aspects of marriage may
be less significant than the economic factors. Concerning the former,
more questions were asked by persons over twenty-one than by those
younger, and more by people reared in cities than by those from rural
communities.

Economic questions were raised more by women than by men, and 
more by persons under twenty-one than by those who were older. Per- 
haps the inference to be drawn here is that women feel less familiar with 
business affairs, and therefore more in need of information, and that 
experience may enable one to solve some fundamental economic problems 
for himself even within a very few years. It is understandable also that 
men reared in cities, where competition is keener and financial pressure 
more intense, should be more concerned with the economic aspects of 
marriage than those living in villages or out in the country. 

Religious issues connected with marriage seem to occupy women more 
than men, young persons more than those older, and both men and 
women from rural areas more than those living in cities. These indica- 
tions are quite in line with what one might naturally expect. It is 
common knowledge that women are more faithful church-goers than 
men, and more inclined to be influenced by religious sanctions. By a 
certain age, which of course varies with the individual, most people have 
arrived at some sort of religious attitude, for better or for worse. In 
cities, where a person is constantly dealing with people of various faiths, 
there is likely to be an easier tolerance of differences in creed than in rural
communities where prejudices may be strong and deep-seated. 

Questions concerning health were asked chiefly by men, by both sexes 
under twenty-one, and by women living in cities rather than by women 
of rural communities. Here again certain inferences suggest themselves 
readily enough. Since people naturally think of asking questions about 
things of which they feel ignorant, it is not hard to understand why 
young men should be particularly concerned about matters of health. 
Nursing is traditionally a feminine thing, and boys as they grow up are 
accustomed to having their health looked after by mothers or older 
sisters; consequently they feel more or less helpless in the face of health 
problems which they have not been used to solving for themselves. We 
realize also that women living in cities are constantly having matters of 
hygiene brought to their attention by groups and agencies of various 
kinds, so that they think of such things more often than do women in 
more sparsely-settled places. 

Many of the questions show immaturity on the part of the questioner. 
If these people were fully mature, however, they would hardly be likely 
to stand in any great need of counsel concerning the principal aspects of 
the marriage relation. The fact is that they are not, and could scarcely 
be expected to be. To those who are interested in supplying young 
people with the information and guidance which they require, this study 
may perhaps furnish some useful indications. Evidently the things that 
many young people wish to know involve matters of physiology, psy- 
chology, sociology, economics, mental hygiene, and religion. 





Personality Tests of Partially Sighted Children * 


R. Pintner and G. Forlano 
Teachers College, Columbia University 


Do group tests of the inventory type reveal any differences between 
children in sight conservation classes and those in the regular classes? 
Does a minor physical handicap such as partial loss of vision cause any 
change in personality make-up, so that it can be measured by a per- 
sonality inventory? 

Two group tests were given to children in sight conservation classes, 
namely, The Aspects of Personality Test ¹ and the Pupil Portraits Test.²
The first test attempts to measure three traits: (1) Ascendance-Sub- 
mission; (2) Extroversion-Introversion; (3) Emotional Stability. The 
Pupil Portraits Test measures adjustment to school and home situations. 
It allows the subject to indicate how he feels with reference to many 
situations in his home or school. 

A total of 874 tests was given. The grade and sex distributions of 
the children tested were as follows: 


                            Aspects          Pupil Portraits
                         Boys    Girls       Boys    Girls
Grade 4 and below         ..      38          42      33
Grades 5 and 6            ..     108         103     114
Grades 7 and above        ..      65          72      68


Children in sight conservation classes are generally not allowed to 
read ordinary print, although most of them can do so. In order to make 
sure that the children would not be handicapped by the use of the stand- 
ard test blanks, special permission was received from the publishers to 
reproduce the tests in enlarged form for this experiment. The enlarged 
print of the special typewriter for sight conservation classes was used 
and a mimeographed edition was prepared. This resulted in tests easily 
legible by all children. 


* Some help in gathering this material was rendered by a W.P.A. project under the 
direction of the authors, but the project was closed before the original plan could be 
completed. Nevertheless, the writers wish to acknowledge this help given by Project 
No. 65-1-97-21 W.P. 10 of the Works Progress Administration. 

1 Pintner, R., Loftus, J. J., Forlano, G., and Alster, B. Aspects of Personality Test. 
New York: World Book Co., 1938. 

2 Pintner, R., Maller, J. B., Forlano, G., and Axelrod, H. C. Pupil Portraits:
Test and Manual. New York: Teachers College Bureau of Publications, 1934. 
The Results

Table 1 shows the results on the Aspects of Personality Test for our
cases after the raw scores have been translated into percentile scores
according to the norms given by the authors of the test in the manual of
directions. Because there are separate norms for boys and girls, we have 
calculated the medians and the quartiles for both sexes. 











Table 1
Percentile Scores of Partially Sighted Children
Aspects of Personality Test

                               Boys                  Girls
Grade                     Q1     M     Q3       Q1     M     Q3

Ascendance
  4 and below             24    42     78       18    33     69
  5 and 6                 42    59     78       23    51     79
  7 and above             37    63     77       27    48     72
Extroversion
  4 and below             27    58     76        8    16     58
  5 and 6                 27    58     83       11    35     68
  7 and above             22    42     79       11    43     81
Emotional Stability
  4 and below             17    36     69       21    37     78
  5 and 6                 25    58     83       28    69     90
  7 and above             12    42     75       15    49     83





An inspection of the medians and quartiles shows that the boys as a 
group approximate the norms more closely than do the girls. The great- 
est deviation for the girls is in the measure of Extroversion-Introversion. 
In this trait, the partially sighted girls tend toward introversion. The 
medians are all below the 50th percentile, and all of the quartiles, except 
one, are below normal expectancy. 

Looking now at the results for the grades, we may say that the results 
for grades 5 and 6 are most nearly equal to the norms, whereas grades 4 
and below deviate most from the norms. Our largest sample of children 
occurs in grades 5 and 6, and this sample is probably most representative 
of partially sighted children. In general, therefore, we may conclude 
that partially sighted children on the average show no marked deviations 
from normal children in certain aspects of personality as measured by
group personality inventories. The most probable group deviation is
shown by the partially sighted girls who seem to be more introverted
than normal girls.

Table 2 shows the percentile scores for the Pupil Portraits Test. 
There is practically no difference between the percentile ratings for the 
boys and the girls. Most of the medians are in the forties or fifties,
indicating no great divergence from the norms. The lowest medians are
found for the children in grades 4 and below. This may be due to difficulty
of understanding the language of the test, as the authors of the
test do not recommend it below the fourth grade.

Table 2
Percentile Scores of Partially Sighted Children
Pupil Portraits Test

[Table entries are not legible in the source.]


Comparison with Hard of Hearing 


Pintner * gives the results of the Aspects of Personality Test, administered
to 1171 hard of hearing and 1208 normal children, most of whom
were in grades 5 and 6. It is interesting, therefore, to compare the 
results obtained by him with our present sample of partially sighted 
children. Since he gives his results in raw scores only, we have also 
used raw scores. Furthermore, we have used our results for grades 5 
and 6 only, since the vast majority of the hard of hearing children were
in these two grades. Table 3 gives the medians or means for this com- 
parison. The normal group in this table is the control group used by 
Pintner in his study of the hard of hearing. It is not the standardization 
group used for norms in the test manual. 

Table 3 gives us a general picture of similarity rather than differentiation
among all the groups. So far as this group test can measure the
three personality traits in question, we may conclude that partially
sighted and hard of hearing children react to it in very much the same
manner. Slight handicaps in visual acuity or in auditory acuity do not
lead to marked differences in personality traits as measured by personality
inventories. There are, however, in our results slight differences
between the groups which may be suggestive. The partially sighted,
both boys and girls, score higher in emotional stability than any of the
other groups. They are slightly lower in ascendance, but the difference
here is small. The partially sighted girls are lowest in extroversion; they
tend to be more introverted.

Table 3
Comparison of Partially Sighted and Hard of Hearing

                                                  Boys    Girls
Ascendance
  Partially sighted. Median...................     17      15
  Hard of hearing. Mean.......................     18      16
  Normal. Mean................................     18      16
Extroversion
  Partially sighted. Median...................     21      19
  Hard of hearing. Mean.......................     21      21
  Normal. Mean................................     22      21
Emotional Stability
  Partially sighted. Median...................     27      29
  Hard of hearing. Mean.......................     24      26
  Normal. Mean................................     25      26

* Pintner, R. Some personality traits of hard of hearing children. J. gen. Psychol.,
1942, 60, 143-151.

In a similar manner we may compare the partially sighted and the
hard of hearing on the Pupil Portraits Test. Pintner ⁴ in another article
gives the results for 1,397 hard of hearing and 1,604 normal children
tested on this test. Again we use raw scores for this comparison and
use only our sample of cases in grades 5 and 6. Table 4 shows the
medians or means.

Table 4
Comparison of Partially Sighted and Hard of Hearing
Pupil Portraits Test

                                                  Boys    Girls
Total Adjustment Score
  Partially sighted. Median...................     82      87
  Hard of hearing. Mean.......................     76.7    83.4
  Normal. Mean................................     79.5    84.0
School Adjustment Score
  Partially sighted. Median...................     61      63
  Hard of hearing. Mean.......................     56.2    61.1
  Normal. Mean................................     58.3    61.6
Home Adjustment Score
  Partially sighted. Median...................     22      23
  Hard of hearing. Mean.......................     20.5    22.2
  Normal. Mean................................     21.2    22.4

The partially sighted test slightly above the normal and the hard of 
hearing slightly below. These differences are not great. The feeling of 
greater satisfaction or contentment registered by the partially sighted as 
compared with the hard of hearing is, however, in line with common 
observation, namely, that losses of audition are more likely to cause 
irritation than losses of vision. Pintner divides his hard of hearing group 
into those having less than 15 decibels loss and those having more than 
15 decibels loss. If now we compare our visually handicapped cases 
with the latter group we find that the gap between the visually and 
auditorially handicapped increases slightly. 


Total Adjustment Score 

                                        Boys     Girls 
Partially Sighted—Median                82       87 
Hard of Hearing—Mean                    75.4     81.9 


This hard of hearing group is probably more comparable to our partially 
sighted group from the point of view of the severity of the physical 
handicap. 

Conclusion 


In spite of the slight differences we have pointed out, the major 
emphasis of this study must be placed upon the similarities of, rather 
than the differences between, children with slight visual or auditory 
defects and children without such defects. The slight physical handicap 
does not seem to result in a distinct personality pattern or in an inability 
to feel satisfactorily adjusted. Personality traits, such as are measured 
by the tests we have used, vary in amount from child to child and from 
trait to trait among visually and auditorially handicapped children in the 
same manner as they do among so-called normal children. 


‘Pintner, R. An adjustment test with normal and hard of hearing children. 
J. gen. Psychol., 1940, 56, 367-381. 








Sex Differences in Food Aversions 


Richard Wallen 
University of Cincinnati 


That men hunger is largely a biological phenomenon; that men hunger 
for specific foods can be understood only by searching beyond biological 
facts. Feeding activity ranks with sexual behavior as a demonstration 
of that peculiar and delicate interaction of biological, psychological, and 
cultural influences so often found in the study of human wants; yet the 
investigation of food habits lies almost neglected by psychologists. Some 
factual material is presented by child psychologists (1, 6, 10), but for the 
most part they are primarily concerned with the nutritional aspects of 
child training. A search of texts in general, social, and applied psy- 
chology reveals the meagerness of data relevant to the psychological 
problems involved. There is even a scarcity of speculation. 

Although clinical experience has left little doubt of the importance 
of the family in modifications of eating habits, few will question the 
proposition that even broader social and cultural factors are significant 
determinants of food acceptance and rejection. Remington’s assent to 
the belief is given in a speculative paper on “Social Origins of Dietary 
Habits” (7). Townsend (8, p. 65) expresses the hypothesis in concrete 
terms: “The educated cosmopolite does not hesitate to try strange foods. 
Not so the savage, the child, and the ignorant. In these three classes, 
food prejudices—often curious and irrational—abound.” Several inves- 
tigators (2, 6) have even examined possible processes through which 
social determinants may become effective. Little has been done, how- 
ever, to determine what foods actually are liked or disliked by adults and 
to relate such findings to social or cultural backgrounds. 

One problem among the many related to food prejudices is that of 
determining sex differences in aversions. About the only study available 
on this point is by Tussing (9). His subjects were 184 males and 214 
females, all attending the same college. He administered a check-list 
containing 289 food items (not reported) with instructions to mark each 
one as liked, disliked, indifferent, or never tasted. Only data concerning 
likes and dislikes were analyzed. The results showed that about 60 
per cent of the items were disliked by a larger proportion of women than 
of men. More men than women indicated dislikes for 28 per cent of the 
items. Comparing this finding with his results for the foods liked, 
Tussing (9, p. 199) concludes: “. . . women are relatively stronger in 
their dislikes; men in their likes.” Since ratings of the strength of the 
dislikes were not made, it is hard to see how the data lead to this 
conclusion. Tussing has shown, rather, that more foods are disliked by a 
larger proportion of women than by a larger proportion of men. 

It is the purpose of the present paper to report a study of sex differ- 
ences in food dislikes which used a more satisfactory sample than had 
heretofore been employed. In addition, data will be presented on the 
extent of the prejudices against various foods. 


Procedure 


Check-list. A check-list of 143 foods was prepared and attached to a 
sheet containing questions on sex, date of birth, father’s occupation, 
amount of travel, and other information. The following instructions 
were printed on the sheet: 

A list of foods is on the next page; you are to indicate those you refuse to 
eat. Do this as follows: Read over the list slowly. When you come to the 
name of a food you dislike very much, put a circle around the number of that 
food. In order to determine whether you dislike a food very much, decide 
whether you would refuse that food because you dislike its taste. If you are 
not allowed to eat a food because of your health or your religious beliefs, do 
not mark it unless you personally dislike its taste. It must be assumed that 
each food is properly prepared. Furthermore, your dislike must be strong 
enough that you will not eat the food no matter how it is prepared. For 
example, one person will not eat peas unless they are creamed. Since he does 
eat peas prepared this way, he should not mark peas as a disliked food. After 
marking all the foods you do not like, go back over the list and draw a circle 
around the names of all foods you have never tasted. 


It will be noted that these instructions were designed to make the 
subjects mark their strongest dislikes or aversions. Although emphasis 
was placed on dislike for the taste of a food, we are not at all inclined to 
believe that responses to the questionnaire were determined exclusively 
or even largely by the gustatory and olfactory components of food ex- 
periences. Perhaps it is even true that in the phenomenal taste of foods 
these components cannot be separated from other factors. It is unlikely 
that persons who are reminded of worms when they see spaghetti have 
gustatory experiences similar to those of people who fail to see the resem- 
blance. Our emphasis served, rather, to reduce the total number of 
responses and thereby to yield data on attitudes strong enough to be 
called aversions. Further, with the exception of eggs and soups, the 
instructions ruled out the effects of various ways of preparing the same 
food and focussed attention on the food itself. The question of the 
extent of dislikes for various modes of preparation is not unimportant, 
but it seemed advisable to use the more general approach first. 

It cannot be denied that the use of a check-list entails disadvantages. 
Strictly speaking, we are not dealing with actual eating behavior but with 
reports about it. The way is open for subjects unintentionally to mini- 
mize their dislikes or to misunderstand the purport of the instructions. 
If food rejections in an actual eating situation were studied, such errors 
might not occur. But the eating situation is not as free from social 
pressure as the checking situation is, and particular ways of preparing 
foods (e.g., too salty, too cold) could unduly alter avoidance responses 
and produce unreliability. Another consideration is that to study actual 
rejections in the presence of the number of foods on the check-list would 
require a long time for any one subject—time during which aversions may 
change. The check-list, then, provides a cross section of each person’s 
dislikes at a specified time and allows the investigation of a broad sample. 

In order to encourage sincerity in answering, subjects were not asked 
to sign their names on the lists. All administration was done by trained 
psychologists.¹ 

Subjects. A total of 545 persons engaged in either full- or part-time 
college work filled out the check-list during the academic year of 1941- 
1942. The subjects were drawn from Northwestern University, West 
Virginia University, Cleveland College, the University of Alabama, 
Boston Evening Classes, and the University of Cincinnati. To eliminate 
the possible effects of age differences an attempt was made to limit the 
subjects to students in the first two years of college. A few individuals 
older than 30 years replied, but three-quarters of the subjects were be- 
tween 18 and 25 years of age. 








Table 1 
Medians for Age, States Visited, and Number of Dislikes for Each Group 

                          F-1      M-1      F-2      M-2 
N                         179      111      129      126 
Age                       20.9     21.4     20.5     21.4 
States visited            15.0     12.4      7.9      9.9 
Number of dislikes        10.1      9.8     10.1      6.8 





Replies were classified into four groups according to the sex and 
socio-economic status of the respondents. Using the list of occupational 
prestige values found by Hall (4), we split each sex group into a high and 
a low socio-economic group on the basis of father’s occupation. Since 
we were dealing with a college group which included few persons of very 
low status, the term “low” should be understood in a relative sense. 
That our crude dichotomy has some validity is indicated by the fact that 
the lower status groups had traveled in fewer states than had the higher 
ones (see Table 1). Although our present concern is not with socio- 
economic status in relation to aversions, it seemed advisable to form the 
sub-groups in order to control one more variable and to give a clearer 
picture of our sample. We shall refer to the higher and lower status 
groups of the females as F-1 and F-2 respectively, and to the correspond- 
ing male groups as M-1 and M-2. For a summary description of the 
four groups Table 1 should be consulted. 

¹ The author wishes to acknowledge his debt to the following persons for their 
generous assistance in collecting the data: Dr. Claude Thompson, Dr. J. E. Janney, 
Dr. Ellwood Senderling, Dr. Quin Curtis, and Dr. O. H. Mowrer. 


Results 


Each check-list was scored for total number of dislikes by counting 
only those items which the subjects had tasted but disliked. The dis- 
tributions of these scores are skewed and extend from 0 to 66 dislikes. 
Means for the four groups are: F-1, 13.1; M-1, 12.6; F-2, 13.4; M-2, 9.9. 
The critical ratio for the difference between the means of the first two 
groups is .36, while that computed in the case of the last two groups is 
2.76. For total dislike scores, it appears, a reliable sex difference exists 
when the two low status groups are compared but is not found when the 
high status groups are compared. 
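
The standard deviations behind these critical ratios are not reported, so the two values cannot be re-derived here. As a minimal sketch, assuming the conventional definition (the difference between two means divided by the standard error of that difference, with each standard error taken as SD divided by the square root of N), the computation might look like this. The function name and the numbers in the usage line are hypothetical and are not taken from the study.

```python
import math


def critical_ratio_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (mean_a - mean_b) divided by the standard error of the difference.

    Assumes independent groups and SE of a mean = SD / sqrt(N).
    """
    se_a = sd_a / math.sqrt(n_a)
    se_b = sd_b / math.sqrt(n_b)
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / se_diff


# Hypothetical illustration only: two groups whose means differ by 2 points.
cr = critical_ratio_means(12.0, 6.0, 36, 10.0, 8.0, 64)
print(round(cr, 2))
```

Under this convention a ratio as small as .36 is well within chance variation, while one of 2.76 exceeds the 2.0 level that the paper later uses as its threshold for a reliable difference.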

The next step in the analysis was to compute the percentage of per- 
sons in each group which disliked each item. Only persons who had 
tasted the item were considered in calculating the percentage values; 
those who reported that they had never tasted the food were omitted. 
When it is said that 25 per cent dislike a food, that statement should be 
interpreted to mean that of all who have tasted it, 25 per cent dislike it. 
The percentages (rounded to the nearest 1 per cent) disliking each food 
are listed in Table 2.² Since different numbers of subjects have tasted 
different foods, the base on which each percentage is computed (the 100 
per cent value) is not the same throughout any one group. 
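
The rule described above (the percentage is taken over tasters only, not over the whole group) can be sketched as follows. The response labels and the function name are assumptions made for illustration; the check-list itself recorded circled numbers and circled names rather than these strings.

```python
import math


def percent_disliking(responses):
    """Percentage of tasters who dislike an item, rounded to the nearest 1 per cent.

    `responses` holds one entry per subject in a group: "dislike",
    "never_tasted", or any other value for a tasted food that is not disliked.
    Returns None when nobody in the group has tasted the item.
    """
    tasted = [r for r in responses if r != "never_tasted"]
    if not tasted:
        return None
    disliked = sum(1 for r in tasted if r == "dislike")
    # floor(x + 0.5) rounds halves upward; Python's round() rounds halves to even.
    return math.floor(100 * disliked / len(tasted) + 0.5)


# Hypothetical group of 30: 10 never tasted the food, 5 of the 20 tasters dislike it.
group = ["dislike"] * 5 + ["ok"] * 15 + ["never_tasted"] * 10
print(percent_disliking(group))
```

Because the denominator is the number of tasters, two items with the same count of dislikes can carry different percentages within the same group.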

Inspection of Table 2 reveals rather striking uniformity among the 
groups in the extent of the prejudices against various foods. Differences 
between males and females of the same status seldom exceed 15 per cent 
and are usually less than 5 per cent. This uniformity is in part due to 
the type of items included in the check-list, but it is in part evidence of 
the cultural homogeneity existing in our sample even though it was drawn 
from several different regions of the country. 

The differences between the number of dislikes for males and females 
of each status group were tested, item by item, for statistical reliability. 
We found that 10 of the comparisons between groups F-1 and M-1 
yielded reliable differences (C.R. equals at least 2.0). They occurred in 


² Item 9 was misprinted on the check-list and was not included in the tabulations 
nor in Table 2. 





Table 2 
Percentage of Persons Disliking (Out of Total Number Tasting) 
Item F-1 M-1 F-2 M-2 
RSMO: « Ukass cob btesvaw ees I 8 12 12 
D Tis diay ass vas Be aides 4 5 6 
i, NS i octus spaaenasesel 1 2 1 
4. Ba so ac enseesesiak 2 0 0 
PSs wien cisahse arene 0 0 0 
Gy Ms cn vicncaccdubaxdes 1 8 3 
Fy IS a 6.00 d:tennceaeds 4 4 6 
8. Grapefruit juice............... 4 5 4 
SR ere 5 i) 5 
11. Cream of mushroom soup...... 15 27 11 
12. Cream of tomato soup......... 2 2 4 
13. Cream of asparagus soup....... 17 22 12 
RG) TRE, ik ac is i ve ein ss 0 10 6 
15. Chicken noodle soup.......... 2 2 2 
16. Green pea soup............... 6 16 6 
ye ROT oer re 4 9 8 
TE SO 4 10 5 
IS 545 g-n:0 3: 0s np aie each as 8 11 3 
FITTER ee oe 20 34 22 
ih ID 0G os soc naeetedinneis ais 19 20 16 
ee I, oi 6s gachecpaeiuan inks 16 13 6 
EER RRRE 14 8 6 
24. Finnan haddie................ 22 17 13 
RS ee ee se 1l 8 6 
WE: TINS oo oss Sock. Gorn 6 3 7 
De a ess trap eedibatuanu 7 6 4 
ESTEE EE TN 6 4 6 
SE, oS 50k 0 5s cneueenres 15 10 13 
SRS Faso dSsek Cuekticeurs 25 26 23 
i MKS obs sk can eaadserreen 20 19 17 
EP eee Ort 24 22 22 
i ER sob ns dnien ee ekoene 6 4 3 
a oa tlie ia dled bik Wee 37 31 16 
SIND. 6 ¢ 4. 9:A's Klou Kas oe 8 0 18 15 16 
PM Sey ccbbaceteusaceees 22 22 7 
ee CNG os aks bias wh VOC ORS 41 33 25 
SETS: Oy Pap er oe 21 18 16 
SEG Pree ® See ee 20 14 14 
ER 0's p00 dime teen Dorsch Sie 1 0 0 
ee Is wo wns antidote 65 0's 6 6 6 
42. Leg of mutton. .........5.066: 10 9 8 
rr eee 4 2 2 
Ge GE i pin 0s sc civ bans os 5 2 2 
WN a0 neuen oneness eccaws 2 2 1 
Oe Ns rk FRR AS 3 2 2 
Ts CS 5 5 cawcd ebivetanasecden 2 3 1 
Table 2—Continued 

Item F-1 M-1 F-2 M-2 
i NONI 6a 3. uc Ua eee wee ees 15 12 15 9 
97. Brussels sprouts.............. 10 16 14 8 
Bs ING ys ds be 've cp ob edcbieen 6 10 1 4 
Ce Sues es aawcaaeaed On 2 4 1 3 
ee Fe eee 8 16 1l 13 
Se, GS, .<.o cacedat bakes cz 1 3 2 1 
pO ER eo er ee 18 12 12 4 
Me MCS ks oc bea eee amaeeas 1 1 0 1 
Sh: MO 2... Kalas ew eS ko 44 18 44 15 
Se IC CS 0. cobb cb oee scabs 9 12 8 6 
106. Dandelion greens............. 30 22 27 21 
FP I web etNis ic bccdanabd 29 31 30 24 
Sh SN oS as Osa b eek edanc wane 12 10 16 12 
Pe, I a2 ce nba tWetaak se 10 12 10 ° 9 
Cen MND, oe Sok sean ohnwiheees 1 0 0 2 
RS Coosa sk sw vey whee Meee ee 21 13 24 12 
Ee ee eee 15 16 17 9 
Pes er re oe 20 17 19 15 
RNS 6 oss 58s cea beebea wes 23 22 24 12 
RN CeO ss. we 4 gk ea en ace 2 0 2 0 
116. Green peppers................ 7 14 5 8 
ee sa a 5s cS brancstnn ovcank 4 6 2 2 
Se es cae, enh es 10 7 8 6 
Rs Hi iN « o0's0 0 eS 0b baeeu 19 24 25 22 
120. Sweet potatoes............... 7 4 5 7 
ES CEL PPE COSTE 4 0 2 3 
I 4's 6ceadensetovbenské 18 21 13 16 
PS... os cackarnie ews ds 9 10 4 s 
EEF ee eee 22 26 22 ll 
:, soc cdnu Rindge hae 2 0 1 2 
ESR ree, Se eae 1 0 0 0 
127. Cantaloupes.................. 4 1 3 2 
128. Cranberries....... 5 6 4 5 
RE Tere ee 4 6 5 4 
ST MINIS... 5 a's) v0 biece ne sccialocete eee 1 0 1 1 
Ree NR. sin ev a cca neatn es 2 1 2 2 
SBS; Tibemerrias... . . ooo. ccc etn 6 6 8 3 
SS SS ES AAR: Teme 10 10 9 6 
BG. Giveein GHWGR. . co. 5 oc Sec ceccse 10 17 14 13 
EE. EOE 21 31 25 19 
SES eee 2 0 0 0 
er ee eee 1 1 0 1 
138. ee hier ieinse 1 1 0 2 
139. Strawberries.................. 1 0 0 1 
BE IN © Sob w oie ene hose es 4 0 2 4 
DN. iii ven sds cauecss 1 4 3 2 
142. Huckleberries................. 2 2 5 2 
i iso alas pment wo keke 0 0 2 1 


the case of coffee, Scotch broth, caviar, clams, hominy, spaghetti, arti- 
choke, tomatoes, cantaloupe, and watermelon. In only two of these 
comparisons (clams, artichoke) did more males than females dislike the 
food. For the comparison of groups F-2 and M-2 the following 15 items 
showed reliable differences: cream of mushroom soup, cream of asparagus 
soup, green pea soup, bass, clams, lobster, pigeon, quail, venison, rabbit, 
limburger, dasheens, kale, parsnips, and avocado. In all 15 cases a 
greater per cent of the females than of the males disliked the food. Thus, 
although males do not differ greatly from females in their aversions, 
females are somewhat more prone to have aversions than males. 
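
The paper gives the criterion for a reliable item difference (C.R. of at least 2.0) but not the formula used. A minimal sketch, assuming the usual unpooled standard error for the difference between two independent proportions, is given below; the function name and the counts in the usage line are hypothetical.

```python
import math


def critical_ratio_proportions(dislike_a, tasted_a, dislike_b, tasted_b):
    """Critical ratio for the difference between two proportions of tasters.

    Assumes SE_diff = sqrt(p_a*q_a/n_a + p_b*q_b/n_b). Returns None when the
    standard error is zero (both proportions are exactly 0 or 1).
    """
    p_a = dislike_a / tasted_a
    p_b = dislike_b / tasted_b
    se_diff = math.sqrt(p_a * (1 - p_a) / tasted_a + p_b * (1 - p_b) / tasted_b)
    if se_diff == 0:
        return None
    return (p_a - p_b) / se_diff


# Hypothetical item: 50 of 100 female tasters and 30 of 100 male tasters dislike it.
cr = critical_ratio_proportions(50, 100, 30, 100)
print(round(cr, 2), "reliable" if abs(cr) >= 2.0 else "not reliable")
```

The sign of the ratio shows the direction of the difference: positive when the first group dislikes the item more often, negative when the second group does.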

Another type of analysis was used to reveal the sex differences in the 
data. For each item two comparisons of the sexes are possible. Entirely 
aside from the question of statistical reliability, we may study the con- 
sistency of these two comparisons. A greater proportion of the males 
than of the females in both status groups showed aversions in 12 instances 
(8.5 per cent). For 43 items (30.3 per cent) the percentage of females 
disliking was greater in both comparisons. In the remaining 87 cases 
(61.2 per cent) no consistent trend appeared. Despite the fact that no 
clear sex differences were found for the majority of the items in our list, 
we may conclude that, when consistent differences are present, prejudice 
is more likely to occur among females than among males. 
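
The consistency analysis can be sketched as a simple classification of each item by the direction of its two sex comparisons. The four rows below are copied from the legible part of Table 2 purely as illustrations; treating a tie in either status group as "no consistent trend" is an assumption, since the paper does not say how ties were handled.

```python
from collections import Counter


def classify_item(f1, m1, f2, m2):
    """Label an item by the direction of the F-1 vs M-1 and F-2 vs M-2 comparisons."""
    if f1 > m1 and f2 > m2:
        return "females"       # females higher in both status groups
    if m1 > f1 and m2 > f2:
        return "males"         # males higher in both status groups
    return "inconsistent"      # directions disagree, or a tie occurs


# Percentages in the order F-1, M-1, F-2, M-2, as printed in Table 2.
rows = {
    "Dandelion greens": (30, 22, 27, 21),
    "Green peppers": (7, 14, 5, 8),
    "Brussels sprouts": (10, 16, 14, 8),
    "Sweet potatoes": (7, 4, 5, 7),
}

labels = {name: classify_item(*values) for name, values in rows.items()}
tally = Counter(labels.values())
print(labels)
print(tally)
```

Applied to the full list, a tally of this kind would correspond to the three counts reported above (12, 43, and 87), which together cover the 142 items tabulated.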

To discover which foods create the most general prejudice a list of 
the ten most commonly disliked foods was constructed for each group. 
Six items were in all four lists: buttermilk, brains, limburger, pig’s feet, 
tripe, and kidneys. Beef heart occurred in three of the four lists. It is 
undoubtedly significant that internal organs appeared so frequently 
among commonly disliked foods. More detailed studies of these and 
other widespread aversions would probably reveal the importance of such 
factors as symbolism and identification in their formation. In interesting 
contrast to foods widely disliked are those for which few people have 
aversions: orangeade, lettuce, apples, pears, and plums. They appear 
to be characterized by mildness of flavor and, perhaps, by a lack of 
unpleasant associations. 


Discussion 


In general, the data of the present study agree with those of Tussing 
in showing that food dislikes are somewhat more common among women 
than among men. We must recognize, however, that the few reliable 
differences are not large. Our findings pose the problem: How can we 
account for the fact that in some instances aversions are more prevalent 
among the females than among the males and yet explain why such cases 
are so infrequent? 

One possibility is that there are no sex differences in actual aversions, 
and our data only reflect the greater reluctance of males to admit their 
prejudices. In this connection it should be recalled that the check-lists 
were unsigned—a procedure designed to reduce unwillingness to admit 
dislikes. But if it be urged that our data mean that there are sex differ- 
ences only with respect to accuracy of report, we should reply that the 
same pressures which prevent males from confessing their dislikes on a 
check-list would be operative in actual eating situations. In fact, it is 
likely that these influences would be stronger in actual situations and 
would tend to force eating of the disliked food; then it could not be con- 
sidered an aversion in the sense of our instructions. Behavior may be 
as deceptive as reports about it. 

Assuming that our results indicate genuine differences in food dislikes, 
we see two possible lines of explanation: the biological and the social. 
The first would suggest that the sexes probably differ in their gustatory 
and olfactory thresholds; females are, perhaps, more sensitive to intense 
stimulation. Foods with strong tastes would then prove to be more un- 
pleasant to females than to males and consequently would be rejected. 
Aside from the equivocal status of sex differences in the thresholds just 
mentioned, we may suggest that if such differences were responsible for 
a food prejudice we should expect to find many more differences in aver- 
sions than is actually the case. Further, inspection of the items which 
show reliable differences between the sexes does not support the notion 
that differences would be most marked in the case of strong-tasting 
foods. The hypothesis cannot account for the differences in dislikes for 
such mild foods as mushroom soup, asparagus soup, quail, and water- 
melon, nor can it account for the failure of clear differences to appear in 
the case of cheeses such as cheddar, camembert, and roquefort. Of 
course, it is expecting too much of any explanation to insist that it account 
for each specific sex difference; we simply wish to point out the difficulty 
of fitting our findings into a scheme which assumes differences in thres- 
holds to be the deciding factor. 

Another possibility derives the differences in aversions from differ- 
ences in the social pressures to which the sexes are subjected. Let us 
begin by assuming that avoidance responses to foods are equally fre- 
quent in the two sexes early in life. Soon, however, differential treatment 
is given to the two groups. If this difference were such as to permit the 
retention of avoidance responses by girls and to discourage their per- 
sistence in the case of boys, we should have made a beginning in the 
explanation of later differences. Such differential treatment does appear 
to exist. There is a certain stereotype into which boys must fit: they 
should refrain from displaying fear and tears, stand pain and discomfort 
“like a man,” refuse to shrink from new experiences, and so on. The 
stereotype for girls differs, in that, although there may be no attempt to 
encourage avoidance responses, there is a much more permissive attitude 
toward them. We permit little girls to fear “bugs” but discourage 
similar timidity in boys. 

A little later, “manliness”—in all its varied distortions—has value in 
its own right and is actively promoted by social pressures from one’s 
fellows. G. Stanley Hall (5, p. 253) has given an account, perhaps 
exaggerated, which specifies some sex differences in adolescents: 


There is a new tendency to experiment, not only with new dishes, but often 
with things strange and even offensive. Boys dare each other to taste, eat or 
swallow offensive and sometimes harmful things, or force their mates to do so, 
sometimes with disastrous results, not infrequently suggesting the nauseous 
anthropological chapter of scatology. Boys sometimes affect or boast of their 
achievements in eating, and girls affect daintiness, becoming exceedingly dis- 
criminating in sweetmeats, bonbons, summer drinks, etc. Boys have eating 
and drinking matches and duels. . . . Girls in particular become squeamish, 
fastidious, and lickerish, and perhaps develop a sweet tooth of disproportionate 
dimensions.* 


The result of these pressures to act in certain ways would be—so far 
as our problem is concerned—to force boys repeatedly to eat foods they 
may dislike. Girls, on the other hand, would not repeat unpleasant food 
experiences nor seek new experiences to the same extent. 

At this point a new factor of importance appears: repetition. An 
investigation by Gauger (3) indicates that repeated presentations of an 
unpleasant food may modify young children’s responses to it in the 
direction of indifference or even of pleasantness. Unfortunately evidence 
is lacking which would show that alteration of tastes is possible by means 
of sheer repetition in the case of adolescents or adults. Nevertheless, 
it is a not unreasonable supposition that some such process may take 
place. Our account pictures boys, motivated by the prestige of fitting 
a stereotype, tasting disliked foods until they become indifferent to them. 
Girls, only weakly encouraged to be “hardy,” continue to reject foods 
they dislike and retain their prejudices. 

It follows from our discussion that as the stereotypes of the sexes 
become more and more alike with respect to “being a good sport” and 
“taking it on the chin,” differences in food aversions will become fewer. 
Presumably there is an increasing similarity in the stereotypes for our 
own culture. Thus, the small number of differences found in the present 
study reflects certain residual differences in the behavior patterns which 
the sexes are expected to assume. If there is a continuing tendency to 
expect similar behavior from boys and girls in situations of fear, disgust, 
and annoyance, it is likely that differences in the extent of food aversions 
will become even smaller than at present. 

* This passage calls to mind the gold-fish swallowing so fashionable a few years 
back. Apparently the sport was less prevalent among females than among males. 

In our frankly theoretical interpretation we have not stressed sym- 
bolic and associative factors in the genesis of food aversions. There can 
be little question of their significance, especially for the understanding of 
particular dislikes, but they do not appear to be factors which could 
account for differences between the sexes. 


Summary 


By means of a check-list the food aversions of 308 females and 237 
males were determined. Comparisons between the sexes lead to the 
following conclusions: 

1. Considerable uniformity exists between the sexes in the extent to 
which various foods are disliked. 

2. For a small proportion of items, reliable differences exist in the 
extent to which males and females report aversions. 

3. In most cases where sex differences occur, a larger proportion of 
females than of males dislike the food. 

4. The differences found in our data can be accounted for by assuming 
social pressures exist which force males to repeat experiences with disliked 
foods but which permit females to retain habits of rejection. 


Book Reviews 


Wren, Harold A. Vocational aspiration levels of adults. New York: 
Teachers College, Columbia University, Contributions to Education, 
No. 855, 1942. Pp. vi + 150. 


Vocational psychologists who see this book will probably expect to 
find a study in which dynamic material, such as that found in many 
laboratory studies of aspiration level, throws light on the practical prob- 
lems of vocational adjustment and job satisfaction. Certainly the field 
is ripe for a well-rounded study of the factors which determine the level 
of vocational aspiration, of the effects of achieving or failing to achieve 
that level, and of modes of adjustment to failure to realize one’s ambition. 

Wren’s study, however, is no such ambitious project. Modestly, and 
rather ploddingly, it confines itself to the first of the above topics. It 
attempts to ascertain the factors which fix the level of aspiration in a 
group of adults, by determining the social and psychological character- 
istics of those at each occupational status and aspiration level. The 
result is a limited but worthwhile contribution to our knowledge of the 
development and functioning of vocational ambitions. 

The subjects of the study were a random sample of the male clients 
of the Adjustment Service experiment in vocational counseling conducted 
in New York City in 1933. The data consisted of brief case records, 
each containing the results of a standard battery of tests, personal data 
blanks, and the counselor’s summary of his interview or interviews. 

The use of data obtained from subjects who had come to a counseling 
service for purposes not including this investigation limits the scope and 
value of the study and results in some findings, the oddness of which 
appears to have escaped the author. A summary on p. 37 states, for 
instance, that “there is a tendency for the workers on the higher levels 
to be older, to be married and support dependents, and to be more domi- 
nant than workers on the lower levels.” Other studies have shown that 
there is some increase in age with occupational level, for the well-known 
reason that many younger men with education and ability start lower 
than their ultimate level and work up. And it is not surprising that an 
older group contains more married men with dependents. But, as the 
tendency in the general population is for number of dependents to be 
negatively correlated with occupational level, Wren would have done 
well to explore the significance and limitations of his finding that those 
on the higher levels have more dependents. Although the author points 
out, elsewhere, that his subjects are highly selected from the point of 
view of education and intelligence, he fails to relate these two sets of 
facts and to view his findings in the light of other research. This criti- 
cism applies to most of the study, for although a clear and concise job 
of reporting has been done, there is very little interpretation. 

The important findings of the study are presented in detail in three 
chapters. The first of these deals with the characteristics of those now 
at each occupational level. Wren confirms the finding of Davidson and 
Anderson to the effect that employment stability is not correlated with 
occupational level; there appears to be as much vocational floundering 
at the higher levels as there is at the lower. Professional and managerial 
workers were more dominant than white collar workers, and these in turn 
were more dominant than manual workers. The analysis is not suffi- 
ciently refined, however, to reveal whether this was a result of the fact 
that those on the lower levels were younger and as yet not established 
in vocational life, or whether it was a basic difference in personality 
which may explain their lower status. The tendency to work at or near 
the traditional family occupational level is confirmed by this study, but 
considerable social mobility existed none the less. General abilities such 
as intelligence and vocabulary increased with occupational level, and 
special abilities tended to be highest at the appropriate occupational 
levels. Education was correlated with occupational level, and, despite 
many exceptions, there was a tendency for those with special types of 
education to be employed in appropriate fields. 

The chapter on aspiration levels shows, as have other studies, that 
aspiration levels are higher than the level of actual employment, and 
that the two rise together. Marriage and a family did not affect aspira- 
tion level, but social dominance, family occupational level, education, 
and amount of general ability did. Those without mechanical ability or 
clerical ability tended to aspire to occupational levels at which such skills 
would not be important. 

A third significant chapter compares those who were employed at 
each level with those aspiring to each level. In each case those who 
aspired to a given level were inferior in most respects to those already 
employed at that level, but those with high ambitions tended to be 
superior to those on their own occupational level. 

Despite its limitations, Wren’s study makes a real contribution to 
our knowledge of the nature and operation of vocational ambitions. Its 
conclusions and implications should be known to all vocational psycholo- 
gists and counselors, as well as to others who work with youth. 
Donald E. Super 
Captain, Air Corps 
Psychological Research Unit No. 1 
Nashville Army Air Center 